NWL  Technical  Report 
TR-2108 


THE METHOD AND USE OF NOVACOM, A PROGRAM FOR
"NON-ORTHOGONAL" ANALYSIS OF VARIANCE AND COVARIANCE

By 

Klaus  Abt 


19  April  1968 




ABSTRACT 

The report contains the description of a program ("NOVACOM") for
the solution of problems in the area of analyzing data based on the
general linear statistical model. While the detailed program documentation
is given elsewhere, the present publication deals with the statistical
method, the logical flow, and the use and application of NOVACOM in
multiple linear regression and ("non-orthogonal") analysis of variance
and covariance for crossed classifications with incomplete and unbalanced
data. The method of NOVACOM is basically a backward ranking procedure
applied to individual independent variables and/or groups of independent
variables (concomitant independent variables and/or ANOVA effects,
respectively). The result of the ranking is a model ("significant model")
which contains only significant concomitant independent variables and/or
ANOVA effects. The method and use of the program are illustrated by
examples of the statistical analysis of bodies of incomplete experimental
data.

FOREWORD


The  work  covered  by  this  report  was  done  in  the  Mathematical 
Statistics  Branch  of  the  Operations  Research  Division,  Computation  and 
Analysis  Laboratory,  under  Foundational  Research  Project  No.  29Y, 
"Computer  Programs  for  Statistical  Analyses." 

The report contains the description of the method and use of the
computer program NOVACOM, which performs analysis of variance and
covariance for unbalanced data classifications with missing values,
i.e., for situations which are often met in the analysis of Naval
ordnance experimentation and test data.

NOVACOM was coded from notes (similar in content to some parts of
the present report) by Mr. T. Herring of the Programming Division,
Computation and Analysis Laboratory. Mr. Herring, who also contributed
significantly to the general methodological concept of NOVACOM,
is the author of the program documentation of NOVACOM ("A Programming
Guide to NOVACOM", NWL Technical Memorandum, in preparation).

Many ideas for the concept of NOVACOM were contributed by Messrs.
C. Bates, G. Gemmill and R. Shade of the Mathematical Statistics Branch.
Mr. A. R. DiDonato and Dr. M. P. Jarnagin of the Mathematics Research
Group, Computation and Analysis Laboratory, developed the method of
the subroutine ISUBX for the computation of the incomplete beta function
ratio contained in NOVACOM. This method is documented in NWL Report
No. 1949, revised October 1966.

The author wishes to thank Dr. Sidney Addelman and Mr. James Merrill
of the Research Triangle Institute, Durham, North Carolina, for their
valuable comments on the interpretation of the NOVACOM results.

The  report  was  typed  by  Miss  Judy  D.  Merryman. 

The  work  on  this  report  was  completed  on  23  July  . 

APPROVED  FOR  RELEASE: 

BERNARD  SMITH 

Technical  Director 

1. INTRODUCTION

The concept of the program NOVACOM ("Non-Orthogonal VAriance and
COvariance analysis by Multiple Regression techniques"), as described
in this report, is based on the multiple regression approach to analysis
of variance (see, for example, Brownlee [1960]). The underlying method
of NOVACOM was developed in such a manner that a wide variety of
problems in the area of analyzing data based on the general linear
statistical model can be solved. The possible applications of NOVACOM
in this area are: multiple linear regression including polynomial
regression, ("orthogonal") analysis of variance and covariance for
crossed classifications with balanced and complete or incomplete data,
and ("non-orthogonal") analysis of variance and covariance for crossed
classifications with incomplete and unbalanced data when possibly some
of the ANOVA effects are confounded.

While regression and "orthogonal" ANOVA may be considered as bonus
areas of application, it was the third area (of "non-orthogonal" analysis
of variance and covariance for crossed classifications) for which NOVACOM
was developed. Most of the theory for the method of the program is
described in the present report. A more detailed outline of the theory
is contained in another paper by the author (Abt [1967]).

The method of NOVACOM is basically a backward ranking procedure
applied to individual independent variables and/or groups of independent
variables, where the groups represent analysis of variance effects in
the general linear model. The ranking is done by order of prediction
power for the dependent variable (response variable), where the so-called
"non-significance" serves as criterion for establishing the ranking. In
ranking the independent variables of analysis of covariance models, the
program makes an internal decision (based upon a significance level $\alpha$
specified by the user) whether to include the covariates for the analysis
of variance part of the ranking. There is some restriction in the ranking
of ANOVA effects in that at a given step of the ranking only those effects
are admissible for ranking whose associated sums of squares are independent
of the restrictions chosen to make the linear model a model of full rank.
(The admissibility is internally determined by the program.) The ranking
procedure leads to a significant model which contains those effects (and
covariates, if any) which are significant at a level specified by the
user, plus those effects, if any, which did not become admissible in the
ranking procedure. Accordingly, NOVACOM may be considered as a screening
tool for significant factorial effects in (crossed) data classifications
with possibly highly incomplete and unbalanced data. A special additional
feature allows for the screening for "the most probable significant model"
when there are confounded ANOVA effects.

The model in NOVACOM may include up to 159 independent variables.
The limitation on the number of covariates is determined by the number
of independent variables representing ANOVA effects. The factors may have
qualitative and/or quantitative levels. In one given problem, up to four
different dependent variables may be analyzed; however, each one is
analyzed in a univariate manner and for the same values of the
independent variables.

Some parts of the method of NOVACOM were taken from the program
DA-MRCA for multiple linear regression (Abt et al. [1966]). Accordingly,
the reader is often referred to this documentation of DA-MRCA which is
listed as "Reference 2" in Section 4 of the present report. While the
user of DA-MRCA was also able to perform, though in a somewhat cumbersome
way, "non-orthogonal" analysis of variance (and covariance), he had no
possibility of arriving at a significant model by ranking methods, and
he had to do the generation of the model and the design matrix mostly
by hand-input. In the method of NOVACOM, considerable emphasis is put
on the automatic generation of the model and the design matrix.

Because of considerations regarding its size, the present report
does not contain any documentation of the programming of NOVACOM. The
FORTRAN IV documentation, as well as general programming notes, are
contained in a report by T. Herring [1967] who also programmed NOVACOM.

In accordance with its title, the present report is restricted to the
description of the method and use of NOVACOM. In addition, the report
contains a number of numerical illustrations of the program's possible
applications. Therefore, the report may serve as a manual for the user
of the program, and for this purpose, the report also contains all the
necessary information for operating the program and for interpreting
the results. This necessitated some overlap with the contents of the
aforementioned report by T. Herring; for example, both reports contain
the description of the control and data cards. The two reports, each
being self-contained, can be defined as the statistician's guide (the
present report) and the programmer's guide to NOVACOM (the report by
T. Herring).


2. METHOD OF NOVACOM


2.1  General  Outline  of  the  Method 


2.1.1  The  Model 


The method of NOVACOM is based on the general linear
statistical model,

$$ y = \beta_0 + \sum_{\nu=1}^{N} \beta_\nu x_\nu + e, \tag{2-1} $$

where

$y$ = "dependent" (random) variable

$x_\nu$ = "independent" (non-random) variables, $\nu = 1, \ldots, N$

$\beta_\nu$ = regression coefficients, $\nu = 1, \ldots, N$

$\beta_0$ = general constant

$e$ = "residual", or "error" term: a random variable with expectation
zero and variance $\sigma^2$, usually assumed to be independently
normally distributed.

More specifically, the model (2-1) in NOVACOM is of the form

$$ y = \beta_0 + \sum_{\nu=1}^{N-T} \beta_\nu u_\nu + \sum_{\nu=N-T+1}^{N} \beta_\nu x_\nu + e, \tag{2-2} $$

where the first $N-T$ independent variables represent analysis of variance
effects and the last $T$ independent variables represent concomitant variables
("covariates" if $0 < T < N$). Any one independent variable ("IV") of the
first $N-T$ will be referred to as a "Design Independent Variable" ("DIV"),
and any one IV of the last $T$ will be referred to as a "Concomitant Independent
Variable" ("CIV"). Consequently, with $T=N$ CIVs, the model (2-2) is that
of multiple linear regression; with $0 < T < N$ CIVs, model (2-2) is that of
analysis of covariance; and with $T=0$ it is the model of analysis of variance.

There are two types of CIVs: (1) those representing the
original, physically observed variables (in other words, the linear terms),
referred to as "Original CIVs" ("OCIVs") and (2) those CIVs representing
polynomial terms, generated from the OCIVs, referred to as "Generated
CIVs" ("GCIVs"). Therefore, T = (number of OCIVs) + (number of GCIVs).


The $N-T$ DIVs (of the analysis of variance part of the
model) require a more extensive discussion.

Since the application of NOVACOM is limited to crossed
classification models, all ANOVA effects may be referred to as "factorial"
effects (main effects, two-factor interactions, three-factor interactions,
etc.). From the fact that the factorial effects are represented by groups
of independent variables (DIVs) in the general linear statistical model
(2-2), all factors in the analysis of variance must be considered as
"fixed effects" factors. That is, "random effects" factors (the levels
of which are randomly sampled from finite or infinite populations of
levels) cannot be treated as such by the program NOVACOM.

First consider a model with only "qualitative" factors,
i.e., factors whose levels correspond to qualitatively specified categories,
such as "types of material", or "manufacturers". As an example, take the
conventional ANOVA model for a two-way crossed classification with
interaction (the two ways of the data classification corresponding to
two factors, say "$\mathcal{A}$" and "$\mathcal{B}$"):

$$ y_{\alpha\beta\rho} = m + a_\alpha + b_\beta + ab_{\alpha\beta} + e_{\alpha\beta\rho} \tag{2-3} $$

with

$\rho = 1, \ldots, R_{\alpha\beta}$

$\alpha = 1, \ldots, A$

$\beta = 1, \ldots, B$.

Here $R_{\alpha\beta}$ is the number of observations ($y$) for the level
combination, or cell, $(\alpha,\beta)$ of the two factors $\mathcal{A}$ and $\mathcal{B}$; and $A$ and $B$ are
the numbers of levels of the two factors, respectively. (For ease of
the following discussion the assumption will be made that $R_{\alpha\beta} > 0$ for all
cells. This assumption, however, is not essential and all features of
the "qualitative" model presently being discussed hold also for cases
with empty cells, that is, for cases with some but not all $R_{\alpha\beta} = 0$.)

Also in (2-3), the model constants (parameters) $m$, $a_\alpha$, $b_\beta$, and $ab_{\alpha\beta}$
represent, respectively, a general constant, the effect of level $\alpha$ of
factor $\mathcal{A}$, the effect of level $\beta$ of factor $\mathcal{B}$, and the interaction effect
of levels $\alpha$ and $\beta$ in cell $(\alpha,\beta)$. The error term $e_{\alpha\beta\rho}$ is assumed to be
normally, independently distributed with expectation zero and variance
$\sigma^2$.


In order to treat the model (2-3) as a linear hypothesis
model of full rank, the parameters $a_\alpha$, $b_\beta$, and $ab_{\alpha\beta}$ must be subjected to
linear restrictions such that the total number of degrees of freedom for
factorial effects is $AB-1$. The restrictions apply identically to the
estimates of the parameters resulting from the solution of the system
of the $AB-1$ normal equations. The set of restrictions used by Graybill
[1961] is a very convenient choice with respect to computational simplicity
as will be shown later:

$$ \begin{aligned} a_A = b_B &= 0 \\ ab_{A\beta} &= 0 \quad \text{for } \beta = 1, \ldots, B \\ ab_{\alpha B} &= 0 \quad \text{for } \alpha = 1, \ldots, A \end{aligned} \tag{2-4} $$

This set of restrictions allows the remaining $AB-1$ parameters to be
considered as contrasts with respect to the last levels of the factors.
(For example, $a_1$ is then a contrast between level 1 and the last level,
$A$, of factor $\mathcal{A}$.)

In order to further discuss the model (2-3), it is
convenient to numerically specify $A$ and $B$, the numbers of levels for
factors $\mathcal{A}$ and $\mathcal{B}$, respectively. Let $A=B=3$, for example. Then there will
be $AB-1=8$ parameters representing factorial effects in the example model.

In the multiple regression approach to analysis of variance
(see, for example, Brownlee [1960]) each one of these parameters is
considered as a "regression coefficient" of an auxiliary independent
variable which takes on the value 1 when the respective effect is present
and the value 0 when the effect is not present. (The auxiliary IVs will
be given the symbols $u_\nu$.) In this approach, the general constant $m$ is
considered as the regression coefficient of a dummy IV, $u_0$, which has the
constant value 1.

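Applying the Graybill restrictions (2-4) and introducing
the variables $u_\nu$ into the model (2-3) with $A=B=3$, one can see that the
$u_\nu$ (except for $u_0$) represent the $N-T=N=8$ "DIVs" for qualitative factorial
effects in the general model (2-2):

$$ \begin{aligned} y_{\alpha\beta\rho} = {} & m u_0 + a_1 u_1 + a_2 u_2 + b_1 u_3 + b_2 u_4 \\ & + ab_{11} u_5 + ab_{12} u_6 + ab_{21} u_7 + ab_{22} u_8 + e_{\alpha\beta\rho}. \end{aligned} \tag{2-5} $$

For cell $(\alpha,\beta) = (1,2)$, for example, the model (2-5) becomes:

$$ \begin{aligned} y_{12\rho} &= m \cdot 1 + a_1 \cdot 1 + a_2 \cdot 0 + b_1 \cdot 0 + b_2 \cdot 1 \\ &\quad + ab_{11} \cdot 0 + ab_{12} \cdot 1 + ab_{21} \cdot 0 + ab_{22} \cdot 0 + e_{12\rho} \\ &= m + a_1 + b_2 + ab_{12} + e_{12\rho}. \end{aligned} \tag{2-6} $$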


As can be seen, the values of the DIVs for interaction constants are the
products of the values of the DIVs for the corresponding main effect
constants. For example, $u_6 = 1 = u_1 u_4 = 1 \cdot 1$; and $u_8 = 0 = u_2 u_4 = 0 \cdot 1$.
This "product rule" (when using restrictions of the type (2-4)) applies
generally to all crossed classification models with qualitative factors.
For example, in a three-way classification where factors $\mathcal{A}$, $\mathcal{B}$, and $\mathcal{C}$
have $A=2$, $B=3$, and $C=3$ levels, respectively, the numerical value of
the DIV attached to the interaction constant $abc_{121}$ is $1 \cdot 1 \cdot 1 = 1$ for the
$R_{121}$ observations in cell (1,2,1) and it is 0 for the observations of
all other cells. The product rule allows a single generation of the
matrix of the coefficients of the normal equations. (See Section 2.2.)
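
The product rule can be illustrated with a short sketch. It is only an illustration in Python, not the FORTRAN IV code of the program; the function names `indicator_divs` and `design_row` are hypothetical, and the ordering of the interaction columns (first index varying slowest) is an assumption chosen to match the two-way example with $A=B=3$ discussed above.

```python
from itertools import product


def indicator_divs(level, n_levels):
    """Indicator DIVs for levels 1..n_levels-1 of one qualitative factor.

    The last level gets all zeros, which mirrors restrictions of the
    type (2-4), where the parameters of the last level are set to zero.
    """
    return [1 if level == k else 0 for k in range(1, n_levels)]


def design_row(alpha, beta, n_a, n_b):
    """Values (u_0, u_1, ..., u_{AB-1}) for one observation in cell (alpha, beta)."""
    u_a = indicator_divs(alpha, n_a)   # main effect DIVs of the first factor
    u_b = indicator_divs(beta, n_b)    # main effect DIVs of the second factor
    # Product rule: each interaction DIV is the product of two main effect DIVs.
    u_ab = [ua * ub for ua, ub in product(u_a, u_b)]
    return [1] + u_a + u_b + u_ab      # leading 1 is the dummy IV u_0


# Cell (1,2) of the 3 x 3 example: u_0, u_1, u_4 and u_6 equal 1.
print(design_row(1, 2, 3, 3))   # [1, 1, 0, 0, 1, 0, 1, 0, 0]
```

An observation in a cell that involves the last level of both factors, such as cell (3,3), gets the value 1 only for $u_0$, which is how the remaining parameters come to be read as contrasts with respect to the last levels.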

Next, consider a model with only "quantitative" factors,
i.e., with factors whose levels are specified by numerical values of
continuous variables, such as "temperature", or "pressure." If the
two-way crossed classification is again taken as an example, the model
for this case can be written:

$$ y_{\alpha\beta\rho} = m u_0 + \sum_{\mu=1}^{A-1} \beta_a^{(\mu)} X_{a\alpha}^{\mu} + \sum_{\nu=1}^{B-1} \beta_b^{(\nu)} X_{b\beta}^{\nu} + \sum_{\mu=1}^{A-1} \sum_{\nu=1}^{B-1} \beta_{ab}^{(\mu,\nu)} X_{a\alpha}^{\mu} X_{b\beta}^{\nu} + e_{\alpha\beta\rho}. \tag{2-7} $$

Here, the $X_{a\alpha} = X_a(\alpha)$ and $X_{b\beta} = X_b(\beta)$ are the numerical values of the
continuous variables $X_a$ and $X_b$, which specify the levels of the quantitative
factors $\mathcal{A}$ and $\mathcal{B}$, respectively. (Accordingly, $X_a$ and $X_b$ will be called
"quantitative factor variables.") That is, model (2-7) is that of polynomial
regression in the usual sense, and the $AB-1$ parameters $\beta_a^{(\mu)}$, $\beta_b^{(\nu)}$,
and $\beta_{ab}^{(\mu,\nu)}$ are the regression coefficients in the usual sense. (NOTE. In
"orthogonal" analysis of variance one would write the model, for this case
of all factors being quantitative, in the conventional way (2-3) rather
than in the way of (2-7). In the course of the analysis, one would
decompose the ANOVA effects into orthogonal contrasts, for example, for
factor $\mathcal{A}$, into the linear contrast, the quadratic contrast, ..., the
contrast of $(A-1)$th order. Since in "non-orthogonal" ANOVA orthogonal
contrasts do not exist, the form (2-7) of the model is used here.)

As already implied in the form of the model (2-7), also
in this case the DIVs representing interactions can be generated from
the DIVs of the corresponding main effects by multiplication. For
example, the DIV representing the $\mathcal{A}_{\text{linear}} \times \mathcal{B}_{\text{quadratic}}$ interaction,
i.e., $X_{a\alpha} X_{b\beta}^{2}$, is the product of the DIVs representing the linear effect of
$\mathcal{A}$ and the quadratic effect of $\mathcal{B}$.

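Finally, consider an ANOVA model with both qualitative and
quantitative factors. As an example again take that of the two-way
classification, where factor $\mathcal{A}$ is quantitative, say, and where factor $\mathcal{B}$
is qualitative. The combination of the two types of models (2-5) and
(2-7) leads to the model of the present case:

$$ y_{\alpha\beta\rho} = m u_0 + \sum_{\mu=1}^{A-1} \beta_a^{(\mu)} X_{a\alpha}^{\mu} + \sum_{\nu=1}^{B-1} b_\nu u_\nu + \sum_{\mu=1}^{A-1} \sum_{\nu=1}^{B-1} \beta_{ab}^{(\mu,\nu)} W_\nu^{(\mu)} + e_{\alpha\beta\rho} \tag{2-8} $$

where

$$ W_\nu^{(\mu)} = X_{a\alpha}^{\mu} \cdot \begin{cases} 1 & \text{if } \beta = \nu \\ 0 & \text{if } \beta \neq \nu \end{cases} \tag{2-9} $$

and $\beta$ = level number of factor $\mathcal{B}$.

For example, with $A=B=3$, the $W_\nu^{(\mu)}$ represent the interaction terms in the
following manner: $W_1^{(1)}$ and $W_2^{(1)}$ are the DIVs of the interaction effect
$\mathcal{A}_{\text{linear}} \times \mathcal{B}$, and $W_1^{(2)}$ and $W_2^{(2)}$ are the DIVs of $\mathcal{A}_{\text{quadratic}} \times \mathcal{B}$.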
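
As a hedged illustration of how DIVs for a quantitative and a qualitative factor could be combined by multiplication, the following Python sketch builds one row of DIV values for the two-way case with $A=B=3$. The function name `mixed_design_row`, the argument names, and the column ordering are assumptions made for this example only; they are not taken from the program.

```python
def mixed_design_row(x_a, beta, n_a, n_b):
    """DIV values for one observation: first factor quantitative, second qualitative.

    x_a  : numerical value of the quantitative factor variable at this level
    beta : level number (1..n_b) of the qualitative factor
    """
    # Polynomial terms of the quantitative factor: linear, quadratic, ...
    powers = [x_a ** mu for mu in range(1, n_a)]
    # Indicator DIVs of the qualitative factor; the last level is all zeros.
    u_b = [1 if beta == nu else 0 for nu in range(1, n_b)]
    # Interaction DIVs: each polynomial term times each indicator.
    w = [p * u for p in powers for u in u_b]
    return [1] + powers + u_b + w


# Quantitative level value 2.0, qualitative level 1, three levels each.
print(mixed_design_row(2.0, 1, 3, 3))
# [1, 2.0, 4.0, 1, 0, 2.0, 0.0, 4.0, 0.0]
```

In this ordering the last four entries correspond to the two DIVs of the linear-by-qualitative interaction followed by the two DIVs of the quadratic-by-qualitative interaction.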

Summarizing, there are three types of DIVs in the analysis
of variance part of the NOVACOM model, where each DIV represents an
individual degree of freedom of an ANOVA effect: (1) the $u_\nu$ and their
products representing individual degrees of freedom of qualitative
factorial effects, (2) the $X_{a\alpha}^{\mu}$, $X_{b\beta}^{\nu}$, ... and their products, representing
quantitative factorial effects, and (3) the $W_\nu^{(\mu)}$, representing individual
degrees of freedom of interaction effects between qualitative and
quantitative factors. All factorial effects involving at least one
qualitative factor with more than two levels are represented by groups
of 2 or more DIVs. These are the groups of DIVs which are subjected
to the backward ranking procedure as will be discussed in Section 2.1.2.

When the given data layout contains unoccupied cells, or
"empty" cells, it is not always possible to fit the constants of the
ANOVA part of the model in a unique way. In other words, in case of
empty cells some of the factorial effects may be confounded. In Appendix
A a method is described for fitting the constants when the model is to
contain interaction effects and when there are empty cells. The method
includes rules for fitting the constants of the various possible models
in case of the presence of confounded effects.

In the case of confounded effects, the user of NOVACOM
has the possibility to analyze all possible models, and to search for
the "most probable" significant model. Each one of the possible models
is treated as a separate problem by NOVACOM since the design matrix and
the matrix of the normal equations (summation matrix) must necessarily
be different for each model, or set of constants fitted. The various
problems corresponding to the various possible models are, therefore,
referred to as "Set No. $\nu$", $\nu = 1, 2, 3, \ldots$. Since the input of these
different sets of constants is done via Control Card No. 4 of the
program, the sets are also referred to as "Control Card 4 Set No. $\nu$",
$\nu = 1, 2, 3, \ldots$. For the use of the results from the various sets see
Sections 3.1.3 and 3.3.2.

For the case of an analysis of covariance model, NOVACOM
provides another option to change the model. The author has shown
(Abt [1960], pp. 102/103) for the case of one covariate ($T=1$ in the
present notation), in which way the analysis of covariance results are
related to those of the analysis of variance (excluding the covariate)
when the factors and their interactions exercise significant effects
upon the covariate. (The covariate then, naturally, does not fulfill
the condition of being a "fixed variate"; but one has to face this
situation which often occurs in practice.) In fact, in analysis of
covariance, if the factors have significant effects upon the covariate,
the significance of the factorial effects, with respect to the dependent
variable, may be considerably reduced when compared to the case where the
covariate is not included in the model. (See Example 6 in Section 3.4.6.)
A corresponding situation exists, naturally, when there is more than one
covariate in the model. In other words, if significant covariates are
kept in the model, the analyst cannot be sure that the true significance
of the factorial effects is shown in the results of the analysis of
covariance.


In order to give the user a possibility to judge the
significance of the factorial effects without having the significant
covariates in the model, NOVACOM will optionally run an analysis of
variance for the factorial effects part of the model alone, i.e., without
all covariates. Also under this option, additional analyses of variance
are run for all OCIVs which were significant, i.e., the significant OCIVs
take the place of the dependent variable in these analyses to study the
influence of the factors and their interactions upon the covariates which
turned out to be significant in the analysis of covariance model. These
additional analyses of variance are identified as "ANVAs" in the program
and will be referred to by this name in the remainder of the present
report. As to the use of the ANVAs see Section 3.3.3.

2.1.2 The Backward Ranking Method

As stated in the Introduction, the program NOVACOM is
mainly intended as a tool to screen, for significant factorial effects,
analysis of variance (or covariance) models for crossed classifications
with incomplete and unbalanced data. The method applied to this end in
NOVACOM is the "backward ranking method" discussed in Abt [1967]. By
this method the individual independent variables and/or groups of
independent variables of the model (2-2) are ranked in an ascending order
of importance.

Speaking,  for  the  moment,  of  an  analysis  of  variance  model 
only,  i.e.,  of  a  model  (2-2)  without  covariates  (T=0),  the  ranking  is 
done  as  follows:  At  the  first  step,  that  ANOVA  effect  (the  group  of 
DIVs)  is  deleted  from  the  model  which  among  all  effects  "admissible  for 
ranking"  (to  be  defined  later)  has  the  smallest  prediction  power  for  y 
as  measured  by  its  "non-significance"  (also  to  be  defined  later);  at 
the  second  step,  those  two  ANOVA  effects  are  deleted  from  the  model 
which  together  have  the  smallest  prediction  power  for  y,  where  one  of 
the  two  effects  is  the  one  ranked  least  important  at  the  first  step  and 
where  the  second  effect  is  an  effect  "admissible  for  ranking"  at  this 
second  step;  and  so  forth  until  all  ANOVA  effects  are  ranked.  This 
method  leads  to  a  unique  ranking  by  importance  of  all  ANOVA  effects  and 
enables the user of the program to define, at a prechosen significance
level $\alpha$, a "significant model."

When the general model of analysis of covariance ($T > 0$)
is again assumed, the $T$ covariates (CIVs) are ranked in a manner corresponding
to that described before for the groups of DIVs; in this case, however, one
independent variable is deleted from the model at each step. The ranking
of the CIVs is done first, i.e., all original $N-T$ DIVs are kept as part
of the model while the $T$ CIVs are being ranked.

The ranking process of the CIVs is abbreviated as "COMO" =
"Concomitant variables Magnitude (of prediction power for y) Ordering",
and the subroutine which performs COMO in the program is identically
named. Correspondingly, the ranking process of the groups of DIVs is
abbreviated as "FEMO" = "Factorial Effects Magnitude (of prediction
power for y) Ordering", and the subroutine performing FEMO in NOVACOM
is again identically named. The names COMO and FEMO are also used, in
a more general meaning, to refer to the whole analysis of covariance
part and to the whole analysis of variance part, respectively, of the
program.


As can be seen from the above, the ranking is performed
cumulatively, that is, at each step all individual and/or groups of
independent variables ranked at previous steps are included in the group
of independent variables sought at the present step to have minimum
prediction power for y. (The reason for ranking "cumulatively" is explained
further below.) The cumulative ranking principle is based on what may be
termed the "main theorem of multiple regression." The content of the
theorem (see, for example, Anderson and Bancroft [1952], p. 172) is as
follows:

MAIN THEOREM. Given the linear model (2-1)

$$ y = \beta_0 + \sum_{\nu=1}^{N} \beta_\nu x_\nu + e, $$

the residuals, $e$, are assumed to be normally independently distributed
with expectation zero and variance $\sigma^2$. Under the null hypothesis
$H_0[\beta_{\nu_1} = \beta_{\nu_2} = \cdots = \beta_{\nu_{N-N'}} = 0]$, where $\beta_{\nu_1}, \ldots, \beta_{\nu_{N-N'}}$
are the regression coefficients of a specified set of $N-N'$ independent
variables whose contribution to the "total" regression sum of squares
(due to all $N$ independent variables) is to be tested, the variance
ratio

$$ F_0 = \frac{SS_{N-N'} / (N-N')}{[ATSS - ASSR(N)] / (n-N-1)} \tag{2-10} $$

is distributed as $F$ with $N-N'$ and $n-N-1$ degrees of freedom. The
terms in this formula are defined as follows:

$ASSR(N)$ = "total" regression sum of squares adjusted for the mean,
with $N$ degrees of freedom, due to all $N$ independent
variables;

$SS_{N-N'} = ASSR(N) - ASSR(N')$ = "additional" regression sum of
squares, with $N-N'$ degrees of freedom, due to the
specified set of $N-N'$ independent variables, where
$ASSR(N')$ is as defined below;

$ASSR(N')$ = regression sum of squares adjusted for the mean, with $N'$
degrees of freedom, due to the $N' < N$ independent variables
left in the model after deleting the $N-N'$ independent
variables whose contribution to the fit is to be tested;

$ATSS = \sum_{i=1}^{n} (y_i - \bar{y})^2$ = total sum of squares (of $y$) adjusted for the
mean, with $n-1$ degrees of freedom;

$n$ = total number of observed $y$-values.
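
The arithmetic of the variance ratio (2-10) can be sketched as follows. This is an illustrative Python fragment only; the function name `variance_ratio`, its argument names, and the numbers in the usage example are hypothetical and do not come from the program or from any data in this report.

```python
def variance_ratio(atss, assr_full, assr_reduced, n_obs, n_full, n_reduced):
    """Variance ratio of (2-10) with its two degrees of freedom.

    atss         : total sum of squares adjusted for the mean (ATSS)
    assr_full    : regression sum of squares of the full model, ASSR(N)
    assr_reduced : regression sum of squares of the reduced model, ASSR(N')
    n_obs        : number of observations n
    n_full       : number of independent variables N in the full model
    n_reduced    : number of independent variables N' left after deletion
    """
    ss_additional = assr_full - assr_reduced   # additional regression SS
    f1 = n_full - n_reduced                    # numerator degrees of freedom
    f2 = n_obs - n_full - 1                    # denominator degrees of freedom
    f0 = (ss_additional / f1) / ((atss - assr_full) / f2)
    return f0, f1, f2


# Made-up numbers: 26 observations, 5 IVs in the full model, 2 of them tested.
print(variance_ratio(100.0, 80.0, 60.0, 26, 5, 3))   # (10.0, 2, 20)
```

Here the additional sum of squares is 20 on 2 degrees of freedom and the residual sum of squares is 20 on 20 degrees of freedom, so the ratio is 10.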


In the terms of the NOVACOM model, the "specified set of
$N-N'$ independent variables" equals the sum of (1) the CIV or group of
DIVs to be considered for ranking at a given step and (2) all previously
ranked IVs. The prediction power for $y$ of this set may be tested by
$F_0$ of (2-10), that is, the null hypothesis that the set of $N-N'$ IVs does
not have any prediction power for $y$ may be tested. Obviously, the more
significant $F_0$ is, the more important is the corresponding group of
independent variables for $y$, and vice versa. This leads to the ranking
criterion "Non-Significance" as used in NOVACOM for the actual ranking:

$$ \text{Non-Significance} = \int_{F_0}^{\infty} \varphi(F)\,dF \tag{2-11} $$

where $\varphi(F)$ is the probability density function of $F$ with $N-N'$ and $n-N-1$
degrees of freedom. The non-significance is the tail area under the
density curve of $F$ to the right of the calculated value $F_0$. One can
easily see from (2-11) that the non-significance equals the significance
level $\alpha$ in the test (2-10) when $F_0$ equals $F_{1-\alpha}$.

The importance of the non-significance as a ranking
criterion lies in the fact that, in comparing various sets of independent
variables, the possibly varying degrees of freedom, $N-N'$, of the sets are
taken into account. The term non-significance is derived from the fact
that a set of IVs with a non-significance which is larger than that of
another set of IVs can be considered as having a prediction power for $y$
which is smaller than that of the second set.

At each step of the analysis of variance part of NOVACOM,
the non-significances are computed, for each admissible effect at that
step, with the incomplete beta function ratio, see, for example, Greenwood
and Hartley [1962], p. 182:

$$ \text{Non-Significance} = \int_{F_0}^{\infty} \varphi(F)\,dF = I_x\!\left(\tfrac{f_2}{2}, \tfrac{f_1}{2}\right), \tag{2-12} $$

where

$$ x = \frac{f_2}{f_2 + f_1 F_0}, \qquad f_1 = N-N', \qquad f_2 = n-N-1. $$
In the text of the present report, the computed value of the non-significance
will be referred to as "I(X)." The subroutine included in NOVACOM for this
computation is called ISUBX and is based on a method by DiDonato and
Jarnagin [1966].
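
The relation (2-12) can be illustrated with a short sketch. The Python code below is not the ISUBX subroutine and does not follow the method of DiDonato and Jarnagin; as an assumption made purely for illustration, it evaluates the incomplete beta function ratio by a standard continued-fraction expansion. The function names are hypothetical.

```python
import math


def _beta_continued_fraction(a, b, x, max_iter=200, eps=1e-14):
    """Continued fraction used for the incomplete beta function ratio."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    if abs(d) < tiny:
        d = tiny
    d = 1.0 / d
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        if abs(d) < tiny:
            d = tiny
        c = 1.0 + aa / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        if abs(d) < tiny:
            d = tiny
        c = 1.0 + aa / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def incomplete_beta_ratio(a, b, x):
    """I_x(a, b), the incomplete beta function ratio."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log1p(-x))
    front = math.exp(log_front)
    # Use the symmetry relation where the fraction converges faster.
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_continued_fraction(a, b, x) / a
    return 1.0 - front * _beta_continued_fraction(b, a, 1.0 - x) / b


def non_significance(f_c, f1, f2):
    """Tail area of F with (f1, f2) degrees of freedom to the right of f_c."""
    if f_c <= 0.0:
        return 1.0
    x = f2 / (f2 + f1 * f_c)
    return incomplete_beta_ratio(0.5 * f2, 0.5 * f1, x)
```

For f1 = 2 the tail area has the closed form x^(f2/2), which gives a convenient check of the sketch: non_significance(3.0, 2, 10) should agree with 0.625 ** 5.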

One can see that in the analysis of covariance part of
NOVACOM, i.e., in COMO, the ranking criterion "non-significance" is
equivalent to that of the "additional regression sum of squares", $SS_{N-N'}$,
of the Main Theorem, (2-10), because at each step the number of degrees of
freedom $f_1$ is the same for all CIVs to be considered for ranking.
Consequently, the CIVs are ranked in the program according to $SS_{N-N'}$.
However, once the least important CIV at a given step has been found
according to $SS_{N-N'}$, the I(X)-value (2-12) for the corresponding group of
N-N' CIVs is computed in order to provide information for the determination
of the significant CIVs.

The program defines the significant CIVs, which are to be
kept in the model during the later ranking of the factorial effects, as
follows. From the established ranking order in COMO (which is achieved
while keeping all original N-T DIVs in the model) and the I(X)-values, the
program looks for the "first significant step", i.e., the step where
I(X) $\le \alpha$ for the first time. (This value $\alpha$, which is selected from
the three $\alpha$-values chosen by the program user, is defined as "$\alpha$-value
No. KALPHA", where KALPHA = 1, 2, or 3 is also the choice of the user.)

All CIVs ranked before the step where I(X) $\le$ ALPHA(KALPHA) for the first
time will be deleted permanently from the model, whereas the others
(i.e., the significant CIVs) will remain part of the model throughout
FEMO.
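
The rule for the "first significant step" can be sketched as follows. This is a minimal illustration only: the function name, the CIV labels, and the I(X)-values are hypothetical and are not output of the program.

```python
def split_significant_civs(ranked_civs, i_values, alpha):
    """Split a backward ranking at the first step where I(X) <= alpha.

    ranked_civs: CIV labels in ranking order, least important first.
    i_values:    I(X) computed at each step of the ranking.
    Returns (deleted, significant).
    """
    for step, i_x in enumerate(i_values):
        if i_x <= alpha:
            # CIVs ranked before this step are dropped; the rest are kept.
            return list(ranked_civs[:step]), list(ranked_civs[step:])
    # No step reached the level alpha: no CIV is significant.
    return list(ranked_civs), []


# Hypothetical ranking with three alpha-values and KALPHA = 2.
ALPHA = (0.10, 0.05, 0.01)
KALPHA = 2
ranked = ["2(2)", "1(1) x 2(1)", "1(2)", "2(1)", "1(1)"]
i_of_x = [0.62, 0.31, 0.04, 0.001, 0.00001]
deleted, significant = split_significant_civs(ranked, i_of_x, ALPHA[KALPHA - 1])
```

In this made-up ranking the third step is the first significant step, so the first two CIVs are deleted and the remaining three stay in the model.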


The ranking of the covariates and the factorial effects
in the manner described above leads to a uniquely defined orthogonal
decomposition of ASSR(N), the "total regression sum of squares", into
the successive "additional regression sums of squares." This is the
main advantage of the ranking method compared to other methods of
applying analysis of variance to incomplete and unbalanced data layouts.
If all N-T degrees of freedom available in such a layout are properly
ascribed to factorial effects (see also Appendix A), the regression sum
of squares, ASSR(N-T), due to all N-T DIVs with which FEMO started, has
degrees of freedom equal to the number of occupied cells in the layout
minus one. However, it is not always desirable (or possible, due to
program limitations) to ascribe all degrees of freedom "between cells"
to factorial effects. In any case (whether or not ASSR(N-T) equals the
sum of squares between cells), the ranking method should and will tend
to ascribe a maximum portion of the regression sum of squares to a
minimum number of factorial effects. Correspondingly, a maximum portion
of ASSR(N) is ascribed to a minimum number of covariates and factorial
effects, or, in multiple regression, to a minimum number of independent
variables.
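
The additivity of the decomposition can be checked with a few lines of arithmetic. The sketch below assumes that each successive additional regression sum of squares is the drop in the regression sum of squares when the effect ranked at that step is deleted; the function name and the numbers are hypothetical.

```python
def successive_additional_ss(assr_by_step):
    """Successive additional regression sums of squares.

    assr_by_step[0] is ASSR(N) for the full model; each later entry is the
    regression sum of squares of the model left after one more effect has
    been dropped. The last entry is 0.0 (no independent variables left).
    """
    return [before - after
            for before, after in zip(assr_by_step, assr_by_step[1:])]


# Hypothetical regression sums of squares along a backward ranking.
assr = [120.0, 118.5, 110.0, 60.0, 0.0]
parts = successive_additional_ss(assr)
```

The parts add up to ASSR(N) whatever the ranking order is; the ranking only decides how the total is distributed over the steps.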
The backward direction of the ranking (as opposed to a
forward direction) is necessitated by the fact that only in this way
are the unities of so-called "compounds" preserved during the ranking
process; see Abt [1967]. A "compound" is defined as a set of $\bar{N}$ ($\le N$)
IVs for which the error variance $\sigma^2$ associated with all $\bar{N}$ IVs is
smaller, by orders of magnitude, than the error variance associated with any
subset of $\bar{N}-1$ IVs, i.e., after any single IV has been excluded from
the set of the $\bar{N}$ IVs comprising the compound.

Note. The reason for ranking the CIVs and/or groups of
DIVs cumulatively on the basis of $F_c$ in (2-10) is to maintain a valid
ranking criterion through the significant model. The
alternative to the cumulative ranking procedure would be to include in
the numerator of a given $F_c$-value (2-10) the additional regression sum
of squares due to only the one effect considered for ranking. For
example, at the second step of the ranking, if a given effect is
represented by N'-N" DIVs (where N > N' > N"), one would use, instead of (2-10),

$$F_c = \frac{SS_{N'-N''}}{N'-N''} \bigg/ \frac{ATSS - ASSR(N)}{n-N-1}$$

to test the null hypothesis that the N'-N" regression coefficients are
all zero. However, this $F_c$-value is distributed as F only if the previous
null hypothesis $H_0$, that the N-N' regression coefficients tested at the
first step are all zero, was accepted. That is,
a ranking order based on this alternative procedure would be valid only
until the first significant effect is reached. From then on, that is, for
all effects except the least important one contained in the significant
model, the ranking order would be invalid. Since, in the method of NOVACOM,
considerable emphasis is placed on the ranking order of the significant
effects also, the cumulative dropping procedure based on the $F_c$-value
(2-10) is adopted here. It is felt that the ranking order (for
non-significant effects) would be changed little, if at all, if the
alternative procedure were applied. However, the significance as
given by I(X) may depend very much upon whether the cumulative
or the alternative ranking method is applied. For this reason, the program
gives the necessary printout to provide the analyst with the information
to determine the significance of the F-test at any given step according
to the alternative method. See Section 3.1.3 for a more detailed discussion.

In addition to the cumulative ranking procedure, the program
NOVACOM has an option to perform also "single deletion", or "single dropping"
(of CIVs and/or groups of DIVs from the model). However, in the procedure
of single dropping the ranking order is taken from the results of the
cumulative procedure without re-employing the I(X)-criterion or sums of
squares criterion for ranking. The single dropping procedure essentially
consists of a redefinition of the model at each step, i.e., of a pooling
of the additional regression sum of squares due to the previously ranked
CIV and/or group of DIVs with the previous error sum of squares at each
step. For example, at the second step of single dropping in FEMO, the
error sum of squares (which was ATSS - ASSR(N-T) at the first step) is
redefined as ATSS - ASSR(N-T) + $SS^{(1)}$, where $SS^{(1)}$ is the additional
regression sum of squares due to the group of DIVs representing the
factorial effect which was ranked (by the cumulative dropping procedure)
as least important at the first step. This means that, for the second
step, the model is redefined as containing all factorial effects of the
original model except the one ranked least important at the first step.

For the factorial effect which was ranked second-least important at the
second step of the cumulative dropping procedure, the I(X)-value (2-12)
is then computed with $f_1$ = degrees of freedom of the effect ranked
second-least important, and with $f_2$ = n-N+T-1 + $DF^{(1)}$, where $DF^{(1)}$ are
the degrees of freedom of the effect ranked least important at the first
step. At the third step of single dropping in FEMO, the above degrees
of freedom $f_1$ and $f_2$ are pooled (as are the corresponding sums of squares)
to form the new error degrees of freedom for the third step; and so forth.

The reason for computing the one I(X)-value, as indicated above, at each
step of single dropping is to provide the necessary information for the
determination of a significant model based on this single dropping
procedure.
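
The pooling in the single dropping procedure can be sketched as follows. The sketch assumes that the additional regression sum of squares and the degrees of freedom of each effect are already known from the cumulative ranking; the function name and the numerical values are hypothetical.

```python
def single_dropping_steps(effects, error_ss, error_df):
    """F-value and degrees of freedom at each step of single dropping.

    effects:  (label, additional_ss, df) in the order established by the
              cumulative ranking, least important effect first.
    error_ss: error sum of squares at the first step.
    error_df: error degrees of freedom at the first step.
    Returns a list of (label, F, f1, f2).
    """
    rows = []
    pooled_ss, pooled_df = error_ss, error_df
    for label, ss, df in effects:
        f_value = (ss / df) / (pooled_ss / pooled_df)
        rows.append((label, f_value, df, pooled_df))
        # The effect just tested is pooled with the error for the next step.
        pooled_ss += ss
        pooled_df += df
    return rows


# Hypothetical two-way layout: interaction ranked least important.
steps = single_dropping_steps(
    [("AB", 4.0, 2), ("B", 30.0, 1), ("A", 90.0, 2)],
    error_ss=20.0,
    error_df=10,
)
```

Each (F, f1, f2) triple can then be turned into an I(X)-value by means of (2-12).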

The reason for having the two ranking procedures in NOVACOM
to determine a significant model and the use of the two procedures are
discussed in Section 3.1.3.

As mentioned previously, at each step of the cumulative
ranking procedure only the "admissible" effects are considered for
ranking at that step. The concept of ranking under rules of "restricted
admissibility" (see Abt [1967]) is based upon the fact that some of the
additional regression sums of squares, $SS_{N-N'}$ of (2-10), which correspond
to certain null hypotheses, are dependent upon the type of linear
restrictions chosen for the model constants; see Scheffé [1959] and
Gosslee and Lucas [1965]. Scheffé, for example, has shown that the
additional regression sum of squares due to either of the two main effects,
$\mathcal{A}$ and $\mathcal{B}$, in the model (2-3) of Section 2.1.1, is dependent upon the
restrictions chosen for the constants $a_\alpha$ and $b_\beta$ ($\alpha = 1,2,\ldots,A$, and
$\beta = 1,2,\ldots,B$) as long as the constants $ab_{\alpha\beta}$ of the interaction $\mathcal{AB}$ are
contained in the model consisting of the N' IVs. For models with qualitative
factors only, the following can be shown to be generally true. The additional
regression sum of squares, $SS_{N-N'}$, due to a factorial effect is dependent
upon the restrictions chosen for the constants of the model as long as a
higher order interaction effect, whose symbol contains all script letters
of the given factorial effect, is retained in the remaining model of the
N' IVs. The given factorial effect, whose symbol consists of script
letters all contained in the symbol of the higher order interaction effect,
will be called a "sub-effect" of that higher order interaction effect.

For example, in a three-way crossed classification with qualitative factors
$\mathcal{A}$, $\mathcal{B}$, and $\mathcal{C}$, an additional regression sum of squares, $SS_{N-N'}$, containing the
effect $\mathcal{AB}$ is restriction-dependent as long as the constants of the
three-factor interaction $\mathcal{ABC}$, with respect to which $\mathcal{AB}$ is a sub-effect, are
contained in the model of the N' IVs. Corresponding restriction dependences
can be shown to be generally true for cases with both qualitative and
quantitative factors. The relations, with respect to "sub-effects", among
the factorial effects in these cases will not be explicitly stated here
but are implied in the admissibility rules given further below and in
Section 2.2.2.


Since, in general, the type of linear restrictions is
arbitrarily chosen, it is logical to look for conditions under which the
additional regression sums of squares are always independent of the
linear restrictions. This independence is achieved by performing the
ranking procedure under so-called "restricted admissibility" rules: In
the backward ranking method, a given factorial effect is considered
admissible for ranking only when all effects of which the given factorial
effect is a sub-effect have been deleted from the model. For example, in
the two-way crossed classification (both factors qualitative) with
interaction, the symbols $\mathcal{A}$ and $\mathcal{B}$ are contained in the symbol $\mathcal{AB}$, which makes
$\mathcal{A}$ and $\mathcal{B}$ sub-effects of $\mathcal{AB}$. Therefore, the main effects $\mathcal{A}$ and $\mathcal{B}$ will not
be considered for ranking before the interaction $\mathcal{AB}$ has been ranked
(deleted). Thus, in the two-way crossed classification, ranking under
restricted admissibility rules always implies ranking the interaction
effect $\mathcal{AB}$ as least important. Correspondingly, in the three-way
classification, the interaction effect of second order, $\mathcal{ABC}$, when fitted, will
always be ranked as the least important effect. Once that is done, the
interactions of first order, $\mathcal{AB}$, $\mathcal{AC}$, and $\mathcal{BC}$, become admissible for ranking.

The main effect $\mathcal{A}$, for example, would become admissible only after $\mathcal{AB}$
and $\mathcal{AC}$, in addition to $\mathcal{ABC}$, had been ranked (deleted). In other words,
according to the backward ranking method under restricted admissibility
rules, the least important effects are always, and by definition, the
interaction effects of highest order. When these interaction effects
become members of the significant model, their "sub-effects" will
automatically become members of the significant model too. The latter merely
reflects what the statistician is always aware of: If the interaction
between two factorial effects (of any order) is significant, then, in
general, each one of the two factorial effects themselves is significant
at least at one level (or level combination) of the other effect(s). For
example, take the case of a significant interaction $\mathcal{AB}$. This significance
implies, in general, that factor $\mathcal{A}$ has a significant effect upon the
response variable at least for one level of factor $\mathcal{B}$; and vice versa, that
factor $\mathcal{B}$ has a significant effect upon the response variable at least for
one level of factor $\mathcal{A}$. (Even in "orthogonal" analysis of variance it does
not make sense to conclude that "$\mathcal{AB}$ is significant but $\mathcal{A}$ and $\mathcal{B}$ are not
significant", merely judging from the F-tests in the ANOVA table. Application
of t-tests at the individual levels of factors $\mathcal{A}$ and $\mathcal{B}$ will, in general,
show significant effects.)
In light of the above reasoning, the forced inclusion, in
the significant model, of factorial effects which are sub-effects of
significant interactions does not appear to be a serious drawback in
establishing a significant model by the backward ranking method under
restricted admissibility rules.
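
The admissibility rule for qualitative effects lends itself to a compact sketch. The code below assumes that each effect is written as a string of factor letters (for example "AB" for the interaction of factors A and B); the function name is hypothetical and the program's internal representation may differ.

```python
def admissible_effects(remaining):
    """Effects that may be ranked at the current step.

    remaining: effect symbols still in the model, e.g. ["A", "B", "AB"].
    An effect is admissible when it is not a sub-effect of any other
    remaining effect, i.e. when no other symbol contains all its letters.
    """
    letters = {effect: frozenset(effect) for effect in remaining}
    return sorted(
        effect for effect, own in letters.items()
        # "own < other" is the proper-subset test on the letter sets.
        if not any(own < other for other in letters.values())
    )


full_three_way = ["A", "B", "C", "AB", "AC", "BC", "ABC"]
first_step = admissible_effects(full_three_way)
after_abc = admissible_effects(["A", "B", "C", "AB", "AC", "BC"])
after_ab_ac = admissible_effects(["A", "B", "C", "BC"])
```

With the three-way layout this reproduces the sequence described in the text: only the second order interaction is admissible at first, then the three first order interactions, and the main effect A only once AB and AC have also been deleted.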

The rules of restricted admissibility in the backward
ranking method are applied, in the way described above, to all qualitative
factorial effects (represented by groups of DIVs of the u-type only, see
Section 2.1.1). Since no linear restrictions are applied to the model
constants of quantitative factorial effects, the problem of dependence
upon linear restrictions does not arise with these effects.
Consequently, there is, in general, no need for restricted admissibility
rules in the ranking of factorial effects when all factors in the model
are quantitative. The exception is when the analyst wants to arrive at
a significant model which contains all polynomial terms of the quantitative
factor variables having lower order than the significant terms. NOVACOM
does provide an option for the indicated type of restricted admissibility
rules in the ranking of effects when all factors in the model are
quantitative. For example, if the term $\lambda^3$ is significant, $\lambda$ and $\lambda^2$ also
would become terms of the significant model under this option. The option
automatically also applies to the ranking of the CIVs (if any are in the
model). This type of "restricted admissibility" (which actually is the
name of this option for cases with quantitative factors or CIVs in the
model) applies also to all cross product terms and can generally be
defined as follows: Under the option of "restricted admissibility",
only those CIVs or DIVs (the latter being powers or cross product terms
of quantitative factor variables) are admissible for ranking at a given
step which are not "Sub-CIVs" (to be defined) or "Sub-DIVs" of other
CIVs or DIVs, respectively, contained in the remaining model of the N'
IVs. A CIV is called a "Sub-CIV" with respect to another CIV when the
symbol of the "Sub-CIV" is contained, as a factor, in the symbol of the
other CIV. An obvious corresponding definition applies to "Sub-DIVs"
(the DIVs being powers or product terms of quantitative factor variables).

For example, $x_1$ is a sub-CIV of $x_1 x_3$, and $\lambda$ is a sub-DIV of $\lambda^2$. One
advantage of ranking under the option of "restricted admissibility" is that
the significant model becomes invariant with respect to variable
transformations (for example, when replacing $x_\nu$ by $x_\nu - \bar{x}_\nu$ for reasons
concerning the accuracy of the matrix inversion). For further discussion see
Reference 2 and Section 3.1.3 on the use of the ranking options in the
present report. (Note. The program user may choose the option "unrestricted
admissibility" when he does not desire to rank the quantitative factorial
effects and CIVs under the option "restricted admissibility" just described.)
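
The "Sub-CIV" relation can be sketched by treating each CIV as a product of powers. The code below assumes that "contained, as a factor" means that every power in the Sub-CIV is less than or equal to the corresponding power in the other CIV; the dictionary representation and the function names are hypothetical.

```python
def is_sub_term(term, other):
    """True when `term` is contained, as a factor, in `other`.

    Both arguments map a variable number to its power,
    e.g. {1: 1, 3: 1} stands for x1 * x3.
    """
    return term != other and all(
        other.get(variable, 0) >= power for variable, power in term.items()
    )


def admissible_terms(remaining):
    """Terms that are not sub-terms of any other remaining term."""
    return [t for t in remaining
            if not any(is_sub_term(t, other) for other in remaining)]


x1 = {1: 1}
x3 = {3: 1}
x1_squared = {1: 2}
x1_x3 = {1: 1, 3: 1}
first_step = admissible_terms([x1, x3, x1_squared, x1_x3])
```

Under this reading only the square and the cross product are admissible at the first step; the linear terms become admissible once both of these have been ranked.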

In  the  model  of  the  type  (2-8),  which  contains  both 
qualitative  and  quantitative  factorial  effects,  ranking  under  rules  of 
restricted  admissibility  will  imply  a  logical  combination  of  the  rules 
outlined  above  separately  for  each  one  of  the  two  types  of  ANOVA  effects. 
For example, in the two-way crossed classification, the interaction effect
of the linear component of (quantitative) factor $\mathcal{A}$ with (qualitative)
factor $\mathcal{B}$ will bear the symbol $\lambda\mathcal{B}$. Under both options of restricted and
unrestricted admissibility, $\lambda$ and $\mathcal{B}$ are sub-effects of $\lambda\mathcal{B}$ and will not
become admissible for ranking until $\lambda\mathcal{B}$ has been ranked, i.e., deleted
from the model. Also, $\lambda^2$ would not be admissible as long as $\lambda^2\mathcal{B}$ is
contained in the remaining model (consisting of N' IVs). In the same
example, all interaction effects $\lambda\mathcal{B}$, $\lambda^2\mathcal{B}$, etc., will be admissible
at the first step of ranking when the option "unrestricted admissibility"
is chosen, which here actually means "relaxed admissibility" since the
admissibility restrictions originating from the presence of the
qualitative effects still do exist. In the case discussed before, when
the analyst wants to keep all polynomial terms of lower order (than the
order of the significant ones) in the significant model, the user chooses
the option for "restricted admissibility." Under this option, in the
above example, at the first step only the interaction containing the highest
power of $\lambda$ would be admissible, followed by the interaction containing
the next lower power at the second step, and so forth.

Summarizing, there are two options for admissibility when
the NOVACOM model contains quantitative factors or CIVs: "restricted
admissibility" according to which all lower order polynomial terms are
kept in the model all the time, and "unrestricted admissibility" according
to which the lower order terms are not necessarily kept in the model.

When  applied  to  the  ANOVA  part  of  the  model  only,  "unrestricted 
admissibility"  is  referred  to  as  "relaxed  admissibility."  For  more 
details  on  the  admissibility  options  see  Section  2.2. 

The rules for ranking factorial effects under restricted
admissibility stated so far, if adhered to, assure the independence of the
additional regression sums of squares, $SS_{N-N'}$, from the linear restrictions
chosen when all cells of the data layout are occupied. For the model of
the type (2-5) with only qualitative factorial effects, adherence to the
rules assures this independence also for data layouts with empty cells.
However, for the model of the type (2-8) with both qualitative and
quantitative factorial effects, the established rules are not sufficient
to assure the independence in case of empty cells. For this situation,
this author has not yet been able to completely define the pattern of the
restriction dependence of the additional regression sums of squares.

As a safeguard in this case, a procedure is used in NOVACOM
which is overconservative in its restrictions on the admissibility but
assures the independence of the additional regression sums of squares from
the linear restrictions imposed on the model constants. The procedure
essentially consists of treating certain interaction effects as if they
were interactions between qualitative factors only. For a formal
definition of these "partially fitted full effects", as they are called,
and of the procedure indicated, see Section 2.2.2.




A final remark in this discussion of the backward ranking
method concerns the fact mentioned before that the (cumulative) ranking is
not terminated when the "significant model" is reached but is continued
until all factorial effects have been ranked. This continued ranking
through the significant model serves two purposes: (1) The analyst will
obtain a ranking of the factorial effects which are contained in the
significant model, i.e., he will know the relative importance of the
effects in the significant model; and (2) he will get an idea of what
his significant model would have looked like had he chosen a different
significance level $\alpha$ for the determination of the significant model.

The second purpose is, to a certain degree, also served by the provision
given in NOVACOM to actually choose three $\alpha$-values for three different
significant models, where a full printout of all pertinent data is given
for each of these significant models.

Ranking through the significant model sometimes leads to
I(X)-values which are so small that they are far beyond the accuracy
limits of the subroutine ISUBX which computes I(X). In order to be able
to rank the factorial effects of the significant model in this case, a
provision is made in the program to automatically redefine the error sum
of squares by one of three pooling procedures. These pooling procedures
(marked by one, two, or three "*"-signs attached to the step number of
the ranking) increase the error sum of squares, thereby decreasing the
F-value of (2-10) and increasing the I(X)-value according to (2-12). Of
the three pooling procedures the first one (the "*"-procedure) is identical
to the single dropping procedure. The "**"- and "***"-procedures are not
justifiable from a theoretical point of view and are just "emergency"
measures to ensure a complete ranking in all cases. For more details, see
Sections 2.3.2 and 2.4.

2.1.3  Accuracy  Checks  on  Matrix  Inversions 

Since the method of NOVACOM is based on the general linear
model (2-1), the accuracy of the results is dependent upon the accuracy
of the inversion of the matrix of the normal equations of rank N+1 and of
all matrices of smaller rank to be inverted at subsequent steps of the
ranking. The matrices may be singular (by faulty fitting of IVs) or they
may be ill-conditioned, such that the inverses are fictitious or inaccurate,
respectively. The procedure used in NOVACOM to check on the validity of
the inverses is essentially that of the program DA-MRCA (see Reference 2).
The main features of the procedure are (1) computation of the matrix
$I_c$, the product of $A$ and its computed inverse $A^{-1}$, where $A$ is the
matrix of the coefficients of the normal equations, and (2) comparison of
the main diagonal elements of $I_c$ with those of the unit matrix $I$. If any
one of the deviations $|i_{\nu\nu} - 1|$ is larger than a small input value,
"TOLI2", where the $i_{\nu\nu}$ are the main diagonal elements of $I_c$, the
inverse $A^{-1}$ is rejected and a new inversion is tried after a specified
admissible CIV or admissible group of DIVs has been deleted from the model.
This procedure is continued until the first time an acceptable inverse
$A^{-1}$ (corresponding to an acceptable, or "good" model) is found. The step
in the ranking method when the "first good model" is found is called the
"first good step." Once the first good step has
been reached, no further accuracy checks on the matrix inversions are
performed. The reason is that, at the first good step, all singularities
must necessarily have been eliminated, and that the accuracy of the
inversion is assumed to improve with the monotonic decrease of the rank
of the matrix at subsequent steps.

The reason for checking only the main diagonal elements
of the matrix $I_c$ is that the off-diagonal elements
of $I_c$ are not necessarily indicative of the accuracy of the inversion;
see Section VI.1.b of Reference 2.

Two preliminary checks are exercised before the procedure
described above is executed: (1) a check whether the determinant of the
matrix $A$ is non-positive and (2) a check whether an element of the main
diagonal of the inverse, $A^{-1}$, is negative. Should either of the two
events happen, the model of the corresponding step in the ranking is
rejected without performing the remaining check(s).
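
The sequence of checks described in this section can be sketched as follows. The code is an illustration under stated assumptions, not the procedure of DA-MRCA: the inversion is a plain Gauss-Jordan elimination, the product is formed as A times its inverse, and the function names and the tolerance argument `tol` (standing in for the input value TOLI2) are hypothetical.

```python
def invert(a):
    """Gauss-Jordan inversion with partial pivoting.

    Returns (inverse, determinant), or (None, 0.0) for a singular matrix.
    """
    n = len(a)
    m = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(a)]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            return None, 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row exchange changes the sign of the determinant
        p = m[col][col]
        det *= p
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0.0:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m], det


def inverse_is_acceptable(a, tol):
    """Apply the two preliminary checks and the main diagonal check."""
    inverse, det = invert(a)
    if inverse is None or det <= 0.0:      # preliminary check (1)
        return False
    n = len(a)
    if any(inverse[i][i] < 0.0 for i in range(n)):  # preliminary check (2)
        return False
    for i in range(n):
        # Main diagonal element of I_c = A * A^-1.
        i_diag = sum(a[i][k] * inverse[k][i] for k in range(n))
        if abs(i_diag - 1.0) > tol:
            return False
    return True
```

A model whose matrix fails any of the checks would be rejected, after which an admissible CIV or group of DIVs is deleted and the inversion is tried again.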

2.1.4  Printout  and  Comprehensive  Analyses 

The program gives a "full printout" of all pertinent data
at selected steps of the ranking. The selected steps include the "first
good step" and each step at which a significant model according to one
of the specified $\alpha$-values is reached. (As mentioned, up to three such
$\alpha$-values may be specified, of which one is to be defined as KALPHA in case
both COMO and FEMO are to be run. The same set of $\alpha$-values is used for
both the cumulative and the single dropping procedure and in COMO as well
as in FEMO.) The full printout consists essentially of the following:

The elements $c_{\nu\nu'}$ of the inverse matrix, $A^{-1}$; the value of the determinant
of $A$; the estimated regression coefficients, $b_\nu$, and their standard
deviations, $s\sqrt{c_{\nu\nu}}$, where $s$ is the square root of the estimated error
variance for the given step; the predicted values, Y, and the prediction
errors, e; the prediction error frequency distribution and the results
of the calculation of the $\chi^2$-test for normality on the prediction errors.

The full printout for any one of the three $\alpha$-values in both cumulative
and single dropping should provide the program user with all information
for the model he decides to use as the "significant model."

In addition to the full printout, a complete identification
of the data input is printed at the beginning of each problem, including a
list of the DIVs and CIVs; the observed values of the quantitative factor
variables, of the OCIVs and the dependent variable(s); all averages and
various other statistics; and the summation matrix (the latter consisting
of the matrix $A$ and the cross product terms with the y's).
For every step of the ranking, the I(X)-values and their
arguments ($a = \tfrac{1}{2} f_2$, $b = \tfrac{1}{2} f_1$) are printed, plus an identification of the
admissible CIVs or effects. The latter enables the program user to follow
the ranking process in detail and to see how the program arrived at the
established ranking order of CIVs and/or factorial effects.

At the end of each problem a "Final Comprehensive Analysis"
("FCA") is printed which gives the results of the ranking in condensed
form. There is an FCA printed for both cumulative and single dropping in
case the option of also performing the single dropping procedure has been
chosen. Each line of the FCA corresponds to one step of the ranking
procedure, and the following are some of the more important items printed:
The symbol of the CIV or effect ranked at this step; a "Procedure"
symbol (PRC) consisting of an asterisk when this step corresponds to one
of the three specified significance levels $\alpha$; the I(X)-value; the two
mean squares in the F-test plus their degrees of freedom; and the
coefficient of determination. See Section 3.5.1 for a detailed discussion
of the FCA. The ANVAs, when applicable (i.e., when there were significant
covariates in an analysis of covariance model and when the ANVA option was
chosen), are also printed in the form of FCAs at the end of a problem.

When  several  Control  Card  4  Sets  have  been  run,  a  "Final 
FCA"  is  printed,  repeating  the  FCAs  from  each  problem  in  order  to  facilitate 
a  convenient  search  for  the  most  probable  significant  model.  See  Section 
3.4.5  for  an  example  of  how  to  use  such  a  Final  FCA. 

2.2  Automatic  Generation  and  Controls 


The present section contains a description of the automatic
generation of the model terms (CIVs, DIVs, effects), the controls over
the admissibility of these terms during the ranking processes, and the
generation of the design matrix in NOVACOM. The notation used is that
which is also printed by the program in the identification of CIVs, DIVs
and factorial effects. Together with the contents of the next section (2.3),
the contents of the present section originally served as the programmer's
information for coding NOVACOM.

2.2.1 Generation and Admissibility of CIVs

First, the notation will be introduced which is used for
the CIVs (concomitant independent variables) in the analysis of covariance
part of the NOVACOM model.

OCIVs ("Original" CIVs), i.e., CIVs of which a physically
observed value exists for each observed value of the dependent variable, y,
are denoted by their cardinal numbers followed by a "1" in parentheses,
indicating the first power, that is, the OCIV itself: 1(1), 2(1), 3(1),
etc., corresponding to $x_1$, $x_2$, $x_3$, etc., respectively, in the usual notation.
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OCIVa  ("Generated"  CIVs)  which  are  powers  of  OCIVs  are  denoted  by  the 
cardinal  number  of  the  OCIV  followed,  in  parentheses.,  by  the  power  to 
which  the  OCIV  is  raised;  for  example,  2(2),  4(6),  etc.,  corresponding 
to  Xg,  x*,  etc.,  respectively.  GCIVs  which  are  products  of  powers  of 
OCIVs  are  formed  by  connecting  CIVs  by  "x's",  for  example,  1(1)  x  £(J?) 
corresponding  to  XiJ&,  and  2(1)  x  3(2)  x  4(6)  corresponding  to  X2X3X4. 
Accordingly,  the  following  definitions  we  introduced: 

A  "CIV"  is  identified  by  one  or  more  pairs  of  numbers,  the  pairs 
being  connected  by  "x's".  The  first  number  in  each  pair  is  called 
the  "cardinal  number"  or  "OCIV-number" ;  the  second  number  is  put  in 
parentheses  and  is  called  the  "power." 

Quite often the user will want to fit a model of order
p > 1 in the concomitant variables. In general, this would mean that
all possible terms up to order p of the OCIVs are desired in the model.

For example, with two OCIVs, 1(1) and 2(1), a complete model of order
p=2 would include the following 5 CIVs: 1(1), 2(1), 1(2), 2(2), 1(1) x 2(1).
(In usual notation, these are $x_1$, $x_2$, $x_1^2$, $x_2^2$, $x_1 x_2$.)
The generation of a pth order model is done automatically in NOVACCM
according to the order p=P put in columns 14 + 15 of Control Card No. 1.
(See Section 3.1.1.)

With TP (columns 12 + 13, Control Card No. 1) being the number of OCIVs,
the program will generate a total number of GCIVs equal to:

$$ T - TP = \sum_{j=1}^{P} \binom{TP + j - 1}{j} - TP , \tag{2-13} $$

and the total number of CIVs will be

$$ T = \sum_{j=1}^{P} \binom{TP + j - 1}{j} . \tag{2-14} $$

In  the  following  Table  2.1,  a  scheme  is  given  for  the 
generation  of  an  example  model  of  order  p=3  with  TP=4  OCIVs.  The  first 
four  columns,  one  each  for  each  OCIV,  give  the  powers  to  which  the  OCIVs 
are  raised  to  form,  after  multiplication,  the  CIV  given  in  the  fifth 
column. The last column gives the number of CIVs in each order-group.


       OCIV No.
     1   2   3   4      CIV-Symbol                 No. of CIVs

     1   0   0   0      1(1)                            4
     0   1   0   0      2(1)
     0   0   1   0      3(1)
     0   0   0   1      4(1)

     2   0   0   0      1(2)                           10
     1   1   0   0      1(1) x 2(1)
     1   0   1   0      1(1) x 3(1)
     1   0   0   1      1(1) x 4(1)
     0   2   0   0      2(2)
     0   1   1   0      2(1) x 3(1)
     0   1   0   1      2(1) x 4(1)
     0   0   2   0      3(2)
     0   0   1   1      3(1) x 4(1)
     0   0   0   2      4(2)

     3   0   0   0      1(3)                           20
     2   1   0   0      1(2) x 2(1)
     2   0   1   0      1(2) x 3(1)
     2   0   0   1      1(2) x 4(1)
     1   2   0   0      1(1) x 2(2)
     1   1   1   0      1(1) x 2(1) x 3(1)
     1   1   0   1      1(1) x 2(1) x 4(1)
     1   0   2   0      1(1) x 3(2)
     1   0   1   1      1(1) x 3(1) x 4(1)
     1   0   0   2      1(1) x 4(2)
     0   3   0   0      2(3)
     0   2   1   0      2(2) x 3(1)
     0   2   0   1      2(2) x 4(1)
     0   1   2   0      2(1) x 3(2)
     0   1   1   1      2(1) x 3(1) x 4(1)
     0   1   0   2      2(1) x 4(2)
     0   0   3   0      3(3)
     0   0   2   1      3(2) x 4(1)
     0   0   1   2      3(1) x 4(2)
     0   0   0   3      4(3)


Table 2.1

In addition to the automatic generation feature for CIVs,
NOVACCM provides options to delete CIVs from the automatically generated
set or to add CIVs to this set. The latter mode of specifying the CIV-
part of the NOVACCM model may also be used to input the entire set of
CIVs by hand. The individual CIVs to be deleted or generated will be
put on Control Card No. 5 (see Section 3.1.1) in the notation as described
above. The various options are provided to make possible the economic
generation of the CIV-part of the model. For the use of the options see
Section 3.1.2.
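
As an illustration of the automatic generation of a complete model of
order P, the following Python sketch enumerates the CIVs and counts them.
It is not the NOVACCM code: the function names generate_civs and symbol
are hypothetical, and each CIV is represented simply by the tuple of
powers to which the OCIVs are raised. For TP=4 and P=3 it yields the 34
CIVs of the example model, and for TP=2 and P=2 the 5 CIVs listed earlier.

```python
from itertools import product
from math import comb


def generate_civs(tp, p):
    """Return the power tuples of all CIVs of order 1..p for tp OCIVs.

    Within each order the tuples come out in descending lexicographic
    order, e.g. (1,0,0,0), (0,1,0,0), ... for order 1.
    """
    civs = []
    for order in range(1, p + 1):
        for powers in product(range(order, -1, -1), repeat=tp):
            if sum(powers) == order:
                civs.append(powers)
    return civs


def symbol(powers):
    """Format a power tuple in the program notation, e.g. 1(2) x 3(1)."""
    return " x ".join(
        f"{k}({e})" for k, e in enumerate(powers, start=1) if e > 0
    )


civs = generate_civs(tp=4, p=3)
# Count from the closed formula: sum over j of C(TP + j - 1, j).
expected = sum(comb(4 + j - 1, j) for j in range(1, 3 + 1))
print(len(civs), expected)
print([symbol(c) for c in civs[:6]])
```

Deleting or adding individual CIVs then amounts to removing tuples from,
or appending tuples to, the generated list.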

As mentioned before, the program user has the choice between
"restricted admissibility" and "unrestricted admissibility" of the CIVs
in the ranking process of the CIVs in COMO. (See column 38 of Control
Card No. 1, Section 3.1.1.) "Unrestricted admissibility" simply means that
all CIVs not yet ranked at a given step of COMO are admissible for ranking
at that step.


The restricted admissibility option is governed by the
following definitions:

At a given step of COMO only those CIVs are admissible for ranking
which are not "Sub-CIVs" of other CIVs not yet ranked. A CIV is
a "Sub-CIV" of another CIV when (1) for each cardinal number in
the symbol of the (sub-) CIV there is the same cardinal number
present in the symbol of the other CIV, and (2) the powers of the
(sub-) CIV are not larger than the corresponding ones of the other
CIV.


In order to illustrate the above definitions, take the
example of the model of order p=3 given above. For simplification, assume
that the GCIVs of order p=3 except 1(3), 2(3), 3(3), and 4(3) have been
deleted from the complete set of T=34 CIVs by means of Control Card No. 5.
Then, the relations between the CIVs are as given in Table 2.2. In this
example, therefore, the following 10 CIVs would be admissible at the first
step of COMO, under the option of restricted admissibility: 1(1) x 2(1),
1(1) x 3(1), 1(1) x 4(1), 2(1) x 3(1), 2(1) x 4(1), 3(1) x 4(1), 1(3),
2(3), 3(3), 4(3).
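
The Sub-CIV rule can be checked mechanically. The sketch below is an
illustration only, with hypothetical function names (civs_of_order,
is_sub_civ, admissible), and it assumes that a CIV is stored as its tuple
of powers with 0 for an absent OCIV. Under that representation both
conditions of the definition reduce to an element-by-element comparison
of the powers. Applied to the reduced example model of 18 CIVs, it returns
the 10 CIVs named above.

```python
from itertools import product


def civs_of_order(tp, order):
    """Power tuples of all CIVs of exactly the given order for tp OCIVs."""
    return [e for e in product(range(order, -1, -1), repeat=tp)
            if sum(e) == order]


def is_sub_civ(sub, other):
    """True when `sub` is a Sub-CIV of `other`.

    With 0 standing for an absent OCIV, "same cardinal numbers present"
    and "powers not larger" both follow from sub[k] <= other[k] for all k.
    """
    return sub != other and all(s <= o for s, o in zip(sub, other))


def admissible(unranked):
    """CIVs that are not Sub-CIVs of any other CIV not yet ranked."""
    return [c for c in unranked
            if not any(is_sub_civ(c, o) for o in unranked)]


# Example model: orders 1 and 2 complete, of order 3 only the pure cubes.
model = (civs_of_order(4, 1) + civs_of_order(4, 2)
         + [e for e in civs_of_order(4, 3) if max(e) == 3])
first_step = admissible(model)
print(len(model), len(first_step))
```

After a CIV has been ranked, it is removed from the list and admissible
is evaluated again on the remaining CIVs.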

    1(1)         is Sub-CIV to   1(2), 1(1) x 2(1), 1(1) x 3(1),
                                 1(1) x 4(1), 1(3)
    2(1)         is Sub-CIV to   1(1) x 2(1), 2(2), 2(1) x 3(1),
                                 2(1) x 4(1), 2(3)
    3(1)         is Sub-CIV to   1(1) x 3(1), 2(1) x 3(1), 3(2),
                                 3(1) x 4(1), 3(3)
    4(1)         is Sub-CIV to   1(1) x 4(1), 2(1) x 4(1), 3(1) x 4(1),
                                 4(2), 4(3)
    1(2)         is Sub-CIV to   1(3)
    1(1) x 2(1)  is Sub-CIV to   NONE
    1(1) x 3(1)  is Sub-CIV to   NONE
    1(1) x 4(1)  is Sub-CIV to   NONE
    2(2)         is Sub-CIV to   2(3)
    2(1) x 3(1)  is Sub-CIV to   NONE
    2(1) x 4(1)  is Sub-CIV to   NONE
    3(2)         is Sub-CIV to   3(3)
    3(1) x 4(1)  is Sub-CIV to   NONE
    4(2)         is Sub-CIV to   4(3)
    1(3)         is Sub-CIV to   NONE
    2(3)         is Sub-CIV to   NONE
    3(3)         is Sub-CIV to   NONE
    4(3)         is Sub-CIV to   NONE

Table 2.2

2.2.2 Generation of DIVs and Admissibility of Factorial Effects

The notation for the DIVs (design independent variables)
in the analysis of variance part of the NOVACOM model is based on the
following definitions. A DIV is defined to be of order d, where d is the
order of the factorial effect which the DIV is representing (possibly
together in a group with other DIVs). For example, the DIVs of main effects
have order 1; the DIVs of two-factor interactions have order 2; and in
general, the DIVs of interactions between d factors have order d. The
symbol of a DIV contains d pairs of numbers, where the pairs are connected
by "x's". Each pair of numbers stands for a factor of the crossed
classification, and the first number in the pair is the cardinal number
of the factor, to be called the "factor number." For example, the factor
numbers 1, 2, 3, ... correspond to the usual factor symbols $\mathcal{A}$,
$\mathcal{B}$, $\mathcal{C}$, ..., respectively. The second number in each
pair is connected with the first by an asterisk when the factor is
qualitative and equals the level number of that factor. The second number
of the pair is connected with the first by a point when the factor is
quantitative and equals the power to which the quantitative factor
variable is raised. Accordingly, one has the following definitions:

First number in "factor pair" = "factor number"

Second number in "factor pair" = "level number" when factor qualitative

Second number in "factor pair" = "power" when factor quantitative.

For example, in a three-way crossed classification (with factors
$\mathcal{A}$, $\mathcal{B}$, and $\mathcal{C}$ all at three levels, say)
factor No. 1 ($\mathcal{A}$) may be qualitative, factors No. 2 and No. 3
($\mathcal{B}$ and $\mathcal{C}$) may be quantitative. Then, typical DIVs of
order 1 are 1*1 and 1*2, where these two "factor pairs" are the two DIVs
representing the main effect of $\mathcal{A}$, or (qualitative) factor
No. 1. Also, the factor pairs 2.1 and 2.2 are the two DIVs of order 1
corresponding to the first and second power of the quantitative factor
variable of factor No. 2. (In the notation of Section 2.1.1, these two DIVs
stand for $x_{\mathcal{B}}$ and $x_{\mathcal{B}}^2$, respectively.) The DIVs
representing the interaction between the first and the second factor are
then 1*1 x 2.1, 1*2 x 2.1, 1*1 x 2.2, and 1*2 x 2.2. DIVs of higher order
are formed in a corresponding manner.

Whereas in the generation of CIVs the program user may want
to generate a model which contains all CIVs up to a given order P, the user
will want, in general, to generate a model containing all DIVs up to a given
order d=D, say. This order, D, in general, will be equal to the number of
factors in the data layout to be analyzed. Accordingly, NOVACCM provides
an option to generate all DIVs up to a specified order D (columns 4-5,
Control Card No. 1; see Section 3.1.1). The total number of DIVs generated
under this option depends upon the numbers of levels of the factors. These
numbers are input on Control Card No. 2 (see Section 3.1.1). According
to the linear restriction (2-4) introduced in Section 2.1.1, the "level
number" in a "factor pair" representing a qualitative factor can only go
up to one less than the total number of levels of the factor. For example,
in the case of 3 factors, where the first factor has 4 levels and is
qualitative, and where the second and third factors have 2 and 3 levels,
respectively, and are quantitative, the second numbers in the factor pairs
will go up to 3, 1, and 2, respectively. Therefore, if the specified
order of the model to be automatically generated is D=2, the model
generated will include the (3+1+2) + (3·1+3·2+1·2) = 17 DIVs listed
in Table 2.3.


    DIV-Symbol

    1*1
    1*2
    1*3
    2.1                  3 + 1 + 2 = 6 DIVs of Order 1
    3.1
    3.2

    1*1 x 2.1
    1*2 x 2.1
    1*3 x 2.1
    1*1 x 3.1
    1*1 x 3.2
    1*2 x 3.1            3·1 + 3·2 + 1·2 = 11 DIVs of Order 2
    1*2 x 3.2
    1*3 x 3.1
    1*3 x 3.2
    2.1 x 3.1
    2.1 x 3.2


Table 2.3


As can easily be seen, the DIVs of order d > 1 can be generated by forming
all possible products among DIVs of order d-1 with unequal "factor numbers"
(first numbers in the pairs).
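
The generation of the DIV symbols can be sketched as follows. This is an
illustration under stated assumptions, not the program's own routine: the
names first_order_pairs and generate_divs are hypothetical, each factor
is described by a (kind, number of levels) pair, and DIVs of higher
order are formed here directly as products of first-order factor pairs
with unequal factor numbers. For the example with 4, 2, and 3 levels and
D=2 the sketch produces the 17 DIVs of Table 2.3.

```python
from itertools import combinations, product


def first_order_pairs(factors):
    """Map factor number -> list of its first-order factor pairs.

    The second number runs up to (levels - 1): level numbers for a
    qualitative factor ("*"), powers for a quantitative one (".").
    """
    pairs = {}
    for number, (kind, levels) in enumerate(factors, start=1):
        sep = "*" if kind == "qualitative" else "."
        pairs[number] = [f"{number}{sep}{j}" for j in range(1, levels)]
    return pairs


def generate_divs(factors, d_max):
    """All DIV symbols of order 1..d_max, lowest order first."""
    pairs = first_order_pairs(factors)
    divs = []
    for d in range(1, d_max + 1):
        # One factor pair from each of d different factors.
        for combo in combinations(sorted(pairs), d):
            for choice in product(*(pairs[k] for k in combo)):
                divs.append(" x ".join(choice))
    return divs


example = [("qualitative", 4), ("quantitative", 2), ("quantitative", 3)]
divs = generate_divs(example, d_max=2)
print(len(divs))
print(divs)
```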

Corresponding to the CIV generation discussed earlier, the
program provides options to delete DIVs from the automatically generated
set or to add DIVs to this set. The individual DIVs to be deleted or
generated will be put on Control Card No. 4 in the notation described
above. For the use of the generation options see again Section 3.1.2.

The following definitions, which are essential for the FEMO
part of the analysis, refer to the final set of DIVs as generated after
the application of the generation option(s) described above.

An "effect" (factorial effect, that is) is defined as the group
of all DIVs of equal order which (1) have equal "factor numbers",
factor by factor, and (2) have equal "powers" in each (quantitative)
"factor pair", pair by pair. (That is, the only quantities which
vary in the symbols of the DIVs representing an "effect" are the
"level numbers.")

Groups of DIVs representing effects according to the above
definition are symbolized by replacing the factor pairs of the qualitative
factors by the factor numbers alone. For example, the group of DIVs
representing the interaction of $\mathcal{A}$ with the first power of
$\mathcal{B}$, when (qualitative) factor $\mathcal{A}$ has, say, 4 levels,
that is, the 3 DIVs 1*1 x 2.1, 1*2 x 2.1, 1*3 x 2.1,
is symbolized by 1 x 2.1.

The  number  of  DIVs  in  a  group  representing  an  "effect" 
equals  the  degrees  of  freedom  of  that  effect. 


As discussed in Section 2.1.2, the program user may choose
between two options which control the admissibility of effects for ranking
at a given step of FEMO: "restricted admissibility" and "relaxed
admissibility." The two options concern ANOVA models which contain at
least one quantitative factor, but the admissibility rules in both options
cover also the case of only qualitative factors in the model. (The two
options are coupled with the options for "restricted admissibility" and
"unrestricted admissibility" of CIVs, respectively, and are controlled
by the same program variable, "CAD", see column 30 of Control Card No. 1.)

The following definitions apply when there are no empty
cells in the original and/or marginal data classifications for which
model terms (DIVs) are fitted.

Under "restricted admissibility", an effect is admissible for ranking
at a given step when it is not a "sub-effect" of other effects not
yet ranked (i.e., of effects still contained in the model of the N'
IVs). An effect is a sub-effect of another effect, (1) when for each
factor number in the symbol of the (sub-) effect there is the same
factor number present in the symbol of the other effect, and (2) when
the powers in the quantitative factor pairs of the (sub-) effect are
not larger than the corresponding ones of the other effect.

(Note. In Section 2.1.2, sub-effects represented by one
DIV were also referred to as "sub-DIVs.")


For instance, in the example of the three-way crossed
classification discussed earlier, where qualitative factor $\mathcal{A}$
had 4 levels and quantitative factors $\mathcal{B}$ and $\mathcal{C}$ had
2 and 3 levels, respectively, the effect 1 x 3.1 (the interaction of
$\mathcal{A}$ with the first power of $\mathcal{C}$), under restricted
admissibility is a sub-effect of 1 x 3.2, 1 x 2.1 x 3.1, and
1 x 2.1 x 3.2. Only when the last three effects have been ranked (i.e.,
deleted from the model of the N' IVs), does 1 x 3.1 become admissible for
ranking. Also in this example for restricted admissibility, 3.1 is a
sub-effect of 3.2, 1 x 3.1, 1 x 3.2, 1 x 2.1 x 3.1, and 1 x 2.1 x 3.2.
As another example, in the three-way crossed classification with
qualitative factors $\mathcal{A}$, $\mathcal{B}$, and $\mathcal{C}$,
effects 1, 2, 3, 1 x 2, 1 x 3, and 2 x 3 are sub-effects of 1 x 2 x 3.
Once 1 x 2 x 3 is ranked (i.e., deleted from the model), 1 x 2, 1 x 3,
and 2 x 3 become admissible. (Note that in the present example of
qualitative factors only, effects of equal order cannot be sub-effects
of each other.)

Under "relaxed admissibility" an effect is admissible for ranking at
a given step when it is not a sub-effect of other effects still
contained in the model of the N' IVs, where a "sub-effect" is now
defined as follows: An effect is a sub-effect of another effect
(1) when for each factor number in the symbol of the (sub-) effect
there is the same factor number present in the symbol of the other
effect, and (2) when the powers in the quantitative factor pairs of
the (sub-) effect are not larger than the corresponding ones of the
other effect, and (3) when the (sub-) effect is of lower order than
the other effect, and (4) when the symbol of the other effect contains
at least one qualitative factor.

For instance, in the example mentioned before, where
qualitative factor $\mathcal{A}$ has 4 levels and quantitative factors
$\mathcal{B}$ and $\mathcal{C}$ have 2 and 3 levels, respectively, the
effect 1 x 3.1 (the interaction of $\mathcal{A}$ with the first power of
$\mathcal{C}$), under relaxed admissibility, is a sub-effect of
1 x 2.1 x 3.1 and 1 x 2.1 x 3.2. When the latter two effects are deleted
from the model, 1 x 3.1 becomes admissible for ranking. Also under relaxed
admissibility, 3.1 is a sub-effect of 1 x 3.1, 1 x 3.2, 1 x 2.1 x 3.1,
and 1 x 2.1 x 3.2.

According to the definition of relaxed admissibility, in an
ANOVA model containing only quantitative factors all effects are admissible
for ranking at the first step and at all subsequent steps of FEMO. That
is, for data classifications with only quantitative factors, "relaxed
admissibility" corresponds to "unrestricted admissibility" in the ranking
of CIVs in COMO.

When  there  are  empty  cells  in  the  original  and/or  marginal 
data  classifications  for  which  DIVs  are  fitted,  with  both  quantitative 
and  qualitative  factors  present,  the  following  definitions  apply. 

A "full effect" is defined as the group of all those "effects" of
equal order (containing both qualitative and quantitative factors)
which have equal "factor numbers", factor by factor, when a complete
set of DIVs has been generated and no DIV has been deleted. The
number of DIVs in a "full effect", or the degrees of freedom of the
full effect, is defined as the product of the d factor level numbers,
each reduced by one.

For instance, in the example with A=4, B=2, C=3 discussed
before, the full effect $\mathcal{A} \times \mathcal{B} \times \mathcal{C}$
is represented by (4-1)(2-1)(3-1) = 6 DIVs, i.e., has 6 degrees of freedom.

A "partially fitted full effect" ("PFFE") is defined as being the
group of all those effects of equal order which have equal factor
numbers, factor by factor, of which at least one factor must be
quantitative, and where at least one DIV is missing preventing the
group from being a "full effect."

For instance, in the above example,
$\mathcal{A} \times \mathcal{B} \times \mathcal{C}$ is a "PFFE"
when the effect 1 x 2.1 x 3.2 (represented by 3 DIVs = 3 degrees of
freedom) is not fitted because of the presence of empty cells.

The following admissibility rule applies, no matter whether
the program user chooses the option of "restricted admissibility" or that
of "relaxed admissibility":

In case of a data classification with empty cells where the set of
factorial effects contains PFFEs, an effect is admissible for ranking
at a given step when it is not a sub-effect of a PFFE which is still
contained in the model of the N' IVs. An effect is a "sub-effect"
of a "PFFE" when for each factor number of the (sub-) effect there is
the same factor number present in the PFFE, and where the order of the
(sub-) effect is smaller than that of the PFFE.

According to the above definition, individual effects within a PFFE are
not sub-effects of that PFFE. In the above example, where the effect
1 x 2.1 x 3.2 was assumed to be excluded from the full effect
$\mathcal{A} \times \mathcal{B} \times \mathcal{C}$, the effect
1 x 2.1 x 3.1, for instance, is not a sub-effect of the PFFE
$\mathcal{A} \times \mathcal{B} \times \mathcal{C}$; however, the effect
1 x 3.2 (the interaction of $\mathcal{A}$ with the second power of
$\mathcal{C}$) is a sub-effect of that PFFE.


In order to illustrate, in a combined manner, all admissibility
rules defined, the example of a three-way classification from the beginning
of the present section is fully discussed, where factors $\mathcal{A}$,
$\mathcal{B}$, and $\mathcal{C}$ have 3 levels each and where factor
$\mathcal{A}$ is qualitative and factors $\mathcal{B}$ and $\mathcal{C}$ are
quantitative.


The  26  DIVs  listed  in  Table  2.4  would  result  from  the  user's 
specification  to  generate  a  model  of  order  D=3.  In  Table  2.4,  7  of  the  26 
DIVs  are  marked  (by  dash  lines)  to  indicate  that  they  have  been  deleted  from 
the  set  by  means  of  Control  Card  No.  4,  assuming  that  the  pattern  of  empty 
cells  does  not  allow  the  fitting  of  these  DIVs.  (See  also  Appendix  A.) 

The reduced set of 26 - 7 = 19 DIVs is given in Table 2.5 which also contains
the grouping of the DIVs into effects and the grouping of effects into PFFEs
as applicable. (The symbols of PFFEs contain only the factor numbers,
that is, they appear as if they were the symbols of effects containing
only qualitative factors. In fact, this is how the PFFEs are actually
treated in the definition of a sub-effect of a PFFE. The symbols for the
PFFEs are only used in the present section.)

Table 2.6 contains the relations which exist among the 15
effects with respect to the definition of a "sub-effect" in restricted
and relaxed admissibility. Clearly, the relations listed in Table 2.6
govern the ranking at the very first step of FEMO, and an effect becomes
admissible for ranking once all those effects have been ranked (deleted
from the model, that is) of which the effect was a sub-effect at the
first step.


    DIV-Symbol

    1*1
    1*2
    2.1
    2.2
    3.1
    3.2
    1*1 x 2.1
    1*1 x 2.2
    1*2 x 2.1
    1*2 x 2.2
    1*1 x 3.1
    1*1 x 3.2
    1*2 x 3.1
    1*2 x 3.2          - - - -
    2.1 x 3.1
    2.1 x 3.2
    2.2 x 3.1
    2.2 x 3.2
    1*1 x 2.1 x 3.1
    1*1 x 2.1 x 3.2    - - - -
    1*1 x 2.2 x 3.1
    1*1 x 2.2 x 3.2    - - - -
    1*2 x 2.1 x 3.1    - - - -
    1*2 x 2.1 x 3.2    - - - -
    1*2 x 2.2 x 3.1    - - - -
    1*2 x 2.2 x 3.2    - - - -

    (DIVs marked by dash lines are deleted.)

Table 2.4


    DIV                  Effect             PFFE

    1*1                  1
    1*2
    2.1                  2.1
    2.2                  2.2
    3.1                  3.1
    3.2                  3.2
    1*1 x 2.1            1 x 2.1
    1*2 x 2.1
    1*1 x 2.2            1 x 2.2
    1*2 x 2.2
    1*1 x 3.1            1 x 3.1            1 x 3
    1*2 x 3.1
    1*1 x 3.2            1 x 3.2            1 x 3
    2.1 x 3.1            2.1 x 3.1
    2.1 x 3.2            2.1 x 3.2
    2.2 x 3.1            2.2 x 3.1
    2.2 x 3.2            2.2 x 3.2
    1*1 x 2.1 x 3.1      1 x 2.1 x 3.1      1 x 2 x 3
    1*1 x 2.2 x 3.1      1 x 2.2 x 3.1      1 x 2 x 3

Table 2.5


                                       Restricted           Relaxed
                                       Admissibility        Admissibility

    1              is a sub-effect of  1 x 2.1              1 x 2.1
                                       1 x 2.2              1 x 2.2
                                       1 x 3.1              1 x 3.1
                                       1 x 3.2              1 x 3.2
                                       1 x 2.1 x 3.1        1 x 2.1 x 3.1
                                       1 x 2.2 x 3.1        1 x 2.2 x 3.1

    2.1            is a sub-effect of  2.2                  -
                                       1 x 2.1              1 x 2.1
                                       1 x 2.2              1 x 2.2
                                       2.1 x 3.1            -
                                       2.1 x 3.2            -
                                       2.2 x 3.1            -
                                       2.2 x 3.2            -
                                       1 x 2.1 x 3.1        1 x 2.1 x 3.1
                                       1 x 2.2 x 3.1        1 x 2.2 x 3.1

    2.2            is a sub-effect of  1 x 2.2              1 x 2.2
                                       2.2 x 3.1            -
                                       2.2 x 3.2            -
                                       1 x 2.2 x 3.1        1 x 2.2 x 3.1
                                       1 x 2 x 3 (PFFE)     1 x 2 x 3

    3.1            is a sub-effect of  3.2                  -
                                       1 x 3.1              1 x 3.1
                                       1 x 3.2              1 x 3.2
                                       2.1 x 3.1            -
                                       2.1 x 3.2            -
                                       2.2 x 3.1            -
                                       2.2 x 3.2            -
                                       1 x 2.1 x 3.1        1 x 2.1 x 3.1
                                       1 x 2.2 x 3.1        1 x 2.2 x 3.1

    3.2            is a sub-effect of  1 x 3.2              1 x 3.2
                                       1 x 3 (PFFE)         1 x 3
                                       2.1 x 3.2            -
                                       2.2 x 3.2            -
                                       1 x 2 x 3 (PFFE)     1 x 2 x 3

    1 x 2.1        is a sub-effect of  1 x 2.2              -
                                       1 x 2.1 x 3.1        1 x 2.1 x 3.1
                                       1 x 2.2 x 3.1        1 x 2.2 x 3.1

    1 x 2.2        is a sub-effect of  1 x 2.2 x 3.1        1 x 2.2 x 3.1
                                       1 x 2 x 3 (PFFE)     1 x 2 x 3

    1 x 3.1        is a sub-effect of  1 x 3.2              -
                                       1 x 2.1 x 3.1        1 x 2.1 x 3.1
                                       1 x 2.2 x 3.1        1 x 2.2 x 3.1

    1 x 3.2        is a sub-effect of  1 x 2 x 3 (PFFE)     1 x 2 x 3

    2.1 x 3.1      is a sub-effect of  2.1 x 3.2            -
                                       2.2 x 3.1            -
                                       2.2 x 3.2            -
                                       1 x 2.1 x 3.1        1 x 2.1 x 3.1
                                       1 x 2.2 x 3.1        1 x 2.2 x 3.1

    2.1 x 3.2      is a sub-effect of  2.2 x 3.2            -
                                       1 x 2 x 3 (PFFE)     1 x 2 x 3

    2.2 x 3.1      is a sub-effect of  2.2 x 3.2            -
                                       1 x 2.2 x 3.1        1 x 2.2 x 3.1
                                       1 x 2 x 3 (PFFE)     1 x 2 x 3

    2.2 x 3.2      is a sub-effect of  1 x 2 x 3 (PFFE)     1 x 2 x 3

    1 x 2.1 x 3.1  is a sub-effect of  1 x 2.2 x 3.1        -

    1 x 2.2 x 3.1  is a sub-effect of  NONE                 NONE

Table 2.6

Also note in Table 2.6 that an effect is not listed as a
sub-effect of a PFFE when the effect is already listed as a sub-effect of
the individual effects contained in the PFFE. For example, effect 1 is
a sub-effect, under both options of "restricted" and "relaxed admissibility",
of effects 1 x 2.1 x 3.1 and 1 x 2.2 x 3.1, which two effects together
comprise the PFFE 1 x 2 x 3 (see Table 2.5). When an effect is a
sub-effect of only some of the effects comprising the PFFE, the effect is
listed as a sub-effect of the entire PFFE also. For example, effect
3.2 is listed as a sub-effect of 1 x 3.2 and of the PFFE 1 x 3.

According to Table 2.6, under the option of "restricted
admissibility", the effects would become admissible for ranking in FEMO as
follows. At the first step of FEMO, only effect 1 x 2.2 x 3.1 would be
admissible and, consequently, would be ranked as the least important
effect. At the second step, once 1 x 2.2 x 3.1 has been deleted, only
1 x 2.1 x 3.1 is admissible and will be ranked as second least important
effect. At the third step, i.e., after deletion of the PFFE 1 x 2 x 3,
effects 1 x 2.2, 1 x 3.2, and 2.2 x 3.2 become admissible. Admissibility
at the fourth step depends upon which one of the three admissible effects
of the third step will have been ranked (deleted from the model); and so
forth.


Under the option of "relaxed admissibility", the ranking,
at the first two steps, would be the same as before. At the third step,
in addition to the three effects being admissible under "restricted
admissibility", the 5 effects 1 x 2.1, 1 x 3.1, 2.1 x 3.1, 2.1 x 3.2,
and 2.2 x 3.1 would become admissible; etc.
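
The admissibility rules for effects, including the PFFE rule, can be
written down compactly. The following Python sketch is an illustration
only; the names parse, is_sub_effect, and admissible and the data layout
(effect symbols as strings, a PFFE as a tuple of factor numbers together
with its member effects) are assumptions made here, not part of NOVACCM.
A PFFE is taken to be "still contained in the model" as long as at least
one of its member effects is unranked. Applied to the 15 effects of the
example, the sketch gives the admissible sets stated above for the first
and third steps.

```python
def parse(symbol):
    """'1 x 2.2 x 3.1' -> {1: None, 2: 2, 3: 1}; None = qualitative factor."""
    effect = {}
    for pair in symbol.split(" x "):
        if "." in pair:
            number, power = pair.split(".")
            effect[int(number)] = int(power)
        else:
            effect[int(pair)] = None
    return effect


def is_sub_effect(sub, other, relaxed=False):
    """Sub-effect test between two effects (restricted or relaxed rule)."""
    a, b = parse(sub), parse(other)
    if sub == other or not set(a) <= set(b):
        return False                        # condition (1): factor numbers
    if any(p is not None and p > b[k] for k, p in a.items()):
        return False                        # condition (2): powers
    if relaxed:                             # conditions (3) and (4)
        return len(a) < len(b) and any(p is None for p in b.values())
    return True


def admissible(unranked, pffes, relaxed=False):
    """Effects that are sub-effects neither of an unranked effect nor of a
    PFFE that still has at least one member effect in the model."""
    active = [set(f) for f, members in pffes
              if any(m in unranked for m in members)]
    result = []
    for e in unranked:
        factors = set(parse(e))
        under_pffe = any(factors < f for f in active)   # proper subset
        under_effect = any(is_sub_effect(e, o, relaxed) for o in unranked)
        if not (under_pffe or under_effect):
            result.append(e)
    return result


EFFECTS = ["1", "2.1", "2.2", "3.1", "3.2",
           "1 x 2.1", "1 x 2.2", "1 x 3.1", "1 x 3.2",
           "2.1 x 3.1", "2.1 x 3.2", "2.2 x 3.1", "2.2 x 3.2",
           "1 x 2.1 x 3.1", "1 x 2.2 x 3.1"]
PFFES = [((1, 3), ["1 x 3.1", "1 x 3.2"]),
         ((1, 2, 3), ["1 x 2.1 x 3.1", "1 x 2.2 x 3.1"])]

step1 = admissible(EFFECTS, PFFES)            # restricted, first step
step3 = admissible(EFFECTS[:13], PFFES)       # after both 3-factor effects
step3_relaxed = admissible(EFFECTS[:13], PFFES, relaxed=True)
print(step1, step3, step3_relaxed, sep="\n")
```

Note that under the relaxed rule the sketch treats both three-factor
effects as admissible at the first step, since Table 2.6 lists neither as
a sub-effect of the other under that option.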

2.2.3  Generation  of  the  Design  Matrix 

The design matrix is defined as the matrix, with n rows
and N+1 columns, of the n coordinate values of the N independent variables,
augmented by a column vector of n 1's for the constant, $x_0 = 1$. Speaking
in terms of the model (2-1), n is the number of observations of the dependent
variable, y. The number N of IVs is the sum of the number (T) of CIVs and
the number (N-T) of DIVs. This implies that the presently discussed
generation of the design matrix refers to the final set of IVs which enter
the analysis.


The CIV-part of the design matrix is generated as follows.

For each of the n observations of y there is one observation each for the
TP OCIVs. The set of observations on the TP OCIVs (for each value of y)
is put on Data Card No. 3, see Section 3.1.1, and enters as such (but
normally coded for reasons of the accuracy of the matrix inversions, see
below) into the design matrix. The GCIV-values, being powers and/or
products of OCIV-values, are computed by the program, according to the
specifications given by the user, and then enter the design matrix. For
example, if there are TP=3 OCIVs, the set of the three numerical OCIV
observations, for one selected y value from the total of n observations, may
be 15, 2, and -1.1. If there is a GCIV in the model with symbol, say,
1(2) x 2(3) x 3(1) ($x_1^2 x_2^3 x_3$ in the usual notation), the program
will assign to it, as a covariate value for the one selected y observation,
the numerical value
$15^2 \times 2^3 \times (-1.1)^1 = 225 \times 8 \times (-1.1) = -1980$.
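
The computation of a GCIV value from the OCIV observations is a product
of powers. A minimal Python sketch follows; the function name gciv_value
is hypothetical, and the powers are given as a tuple with 0 for an OCIV
that does not appear in the symbol.

```python
import math


def gciv_value(ocivs, powers):
    """Value of a GCIV: product of the OCIV values raised to their powers."""
    return math.prod(x ** p for x, p in zip(ocivs, powers))


# GCIV 1(2) x 2(3) x 3(1) for the OCIV observations 15, 2, and -1.1.
value = gciv_value([15, 2, -1.1], (2, 3, 1))
print(round(value, 6))
```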

In case the program user chooses the option for coding the
OCIV values (see columns 20-23 of Control Card No. 1, Section 3.1.1) this
coding will be done by NOVACCM according to the formula

$$ x'_{\nu i} = \frac{x_{\nu i} - \bar{x}_{\nu}}{C R_{\nu}} , \qquad
   \nu = 1, \ldots, TP; \; i = 1, \ldots, n \tag{2-15} $$

where

$$ \bar{x}_{\nu} = \frac{1}{n} \sum_{i=1}^{n} x_{\nu i} , $$

$$ R_{\nu} = \mathrm{MAX}(x_{\nu i}) - \mathrm{MIN}(x_{\nu i}), \qquad
   i = 1, \ldots, n , $$

and where C is an arbitrary constant usually chosen as C=1. (For a
discussion of this coding, see Abt et al. [1966], Section VII.2.a., and
Section 3.1.3 of the present report.) For example, if n=6, and if the
OCIV $x_1$ has the values 15, 6, 2, 18, 10, and 3, one would have

$$ \bar{x}_1 = \frac{1}{6}(15 + 6 + 2 + 18 + 10 + 3) = 9 $$

$$ R_1 = 18 - 2 = 16 $$

and the first coded value would read (if C=1):

$$ x'_{11} = \frac{15 - 9}{1 \cdot 16} = \frac{3}{8} = 0.375 . $$

The program, under the coding option, uses the coded
values to generate the GCIV values.
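
The coding step (subtract the mean, divide by C times the range) is
sketched below in Python. The function name code_values is hypothetical;
the sketch only reproduces the arithmetic of the worked example and makes
no claim about how the program stores or rounds the values.

```python
def code_values(values, c=1.0):
    """Code observations as (x - mean) / (c * range), with range = max - min."""
    mean = sum(values) / len(values)
    value_range = max(values) - min(values)
    return [(x - mean) / (c * value_range) for x in values]


coded = code_values([15, 6, 2, 18, 10, 3])
print(coded[0])
```

The coded values always span a range of 1/C, which keeps powers and
products of the covariates at comparable magnitudes.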

The DIV-part of the design matrix is generated from the
information contained in the cell identification which is given as input
for each one of the n y-observations, see the 1st Data Card, Section 3.1.1.

The cell identification consists of the level numbers of the cell to which
the y-value corresponds. For instance, in the example of Table 2.5
(Section 2.2.2) the cell defined by $\alpha=1$, $\beta=3$, and $\gamma=1$
has the cell identification "131." In accordance with the derivations in
Section 2.1.1, NOVACCM assigns values to the DIVs of first order (i.e., to
DIVs representing main effects) and then generates all DIVs of order d > 1
by multiplication.

A DIV of order 1 for a qualitative factor is assigned, as
numerical value, a 1 when the level number of the DIV equals the corresponding
level number in the cell identification. If the two level numbers are
unequal, the DIV is assigned the value zero. In the example of Table 2.5,
$y_{131}$, as mentioned, corresponds to cell 131. Accordingly, for this
observation of y, DIV 1*1 receives a 1, DIV 1*2 receives a zero.

A DIV of order 1 for a quantitative factor is assigned the
numerical value of that level whose number is given in the cell identification,
and the level value is raised to the power of the DIV. (The level values
of the quantitative factor variables are input on Control Card No. 6, see
Section 3.1.1.) For instance, in the example of Table 2.5 the numerical
values of the levels of (quantitative) factor No. 2 may be -0.13, -0.01,
and +0.20. For this value $y_{131}$, then, DIV 2.1 is assigned the numerical
value 0.20, and DIV 2.2 is assigned the value $(0.20)^2$. If the three
levels of (quantitative) factor No. 3 are -0.50, +0.10, and +0.45, say,
DIVs 3.1 and 3.2 are assigned, again for observation $y_{131}$, the values
-0.50 and $(-0.50)^2$, respectively.

As an example of the assignment of numerical values to DIVs
of order d > 1 by multiplying the values of the respective DIVs of order
1, see the following table of selected DIVs (all DIV values again for
observation $y_{131}$):

    DIV                   Assigned Value for Observation y_131

    1.1 x 2.1             1 · 0.20                 =  0.20
    1.2 x 2.1             0 · 0.20                 =  0
    1.1 x 2.2             1 · (0.20)^2             =  0.04
    1.2 x 2.2             0 · (0.20)^2             =  0
    1.1 x 3.1             1 · (-0.50)              = -0.50
    1.1 x 2.2 x 3.1       1 · (0.20)^2 · (-0.50)   = -0.02
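
The generation of DIV values for one observation can be sketched as follows. This is a minimal illustration and not the program's actual code: the function names are hypothetical, and the level values of factor No. 2 are taken as they read in the text above (only the third level, +0.20, enters the computation for cell 131).

```python
def qualitative_div(div_level, cell_level):
    """Indicator DIV: 1 if the DIV's level equals the level in the cell id."""
    return 1.0 if div_level == cell_level else 0.0


def quantitative_div(power, cell_level, level_values):
    """Power DIV: value of the cell's level, raised to the power of the DIV."""
    return level_values[cell_level - 1] ** power


cell = (1, 3, 1)                       # cell identification "131"
factor2_levels = [-0.13, -0.01, 0.20]  # quantitative factor No. 2
factor3_levels = [-0.50, 0.10, 0.45]   # quantitative factor No. 3

div_1_1 = qualitative_div(1, cell[0])
div_1_2 = qualitative_div(2, cell[0])
div_2_1 = quantitative_div(1, cell[1], factor2_levels)
div_2_2 = quantitative_div(2, cell[1], factor2_levels)
div_3_1 = quantitative_div(1, cell[2], factor3_levels)

# DIVs of order d > 1 are products of the first-order values.
higher_order = {
    "1.1 x 2.1": div_1_1 * div_2_1,
    "1.2 x 2.1": div_1_2 * div_2_1,
    "1.1 x 2.2": div_1_1 * div_2_2,
    "1.1 x 2.2 x 3.1": div_1_1 * div_2_2 * div_3_1,
}
```

Keeping the first-order values separate from the products mirrors the two-stage description: main-effect DIVs are assigned first, and everything of higher order follows by multiplication.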


The coding option of NOVACCM applies also to the values
of the quantitative factor variables, and the coding formulae are similar
to (2-15); for example, for quantitative factor variable $X_\beta$ one has:

$$x_{\beta b} = C\,\frac{X_{\beta b} - \bar{X}_\beta}{R_\beta}, \qquad b = 1, \ldots, B, \tag{2-16}$$

where

$$\bar{X}_\beta = \frac{1}{B}\sum_{b=1}^{B} X_{\beta b}$$

and

$$R_\beta = \mathrm{MAX}(X_{\beta b}) - \mathrm{MIN}(X_{\beta b}).$$

For instance, if $X_\beta$ has the B = 4 levels, say, 10, 23, 37, and 82, $\bar{X}_\beta$ will
be 38 and $R_\beta$ will be 72. With C = 1, the first level would then have the coded value
(10 - 38)/72 = -0.3889. Under the coding option, DIVs of order d > 1 are
generated by NOVACCM using the coded values of the DIVs of order 1.

The generated design matrix plus the n observations each
on the possibly up to 4 different dependent variables, y, form the "data
matrix." From the data matrix the program generates the matrix of the
sums of the cross-products ("summation matrix") for the N+1 IVs (including
the column vector of 1's) and the dependent variable(s). For the algebraic
representation of the summation matrix see Herring [1967].

The  summation  matrix  consists  of  the  matrix  ("A")  of  the 
normal  equations  with  rank  N+l  and  the  column  vectors  whose  elements  are 
the  cross-product  terms  with  y. 
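
A minimal sketch of the summation matrix for a single dependent variable is given below. The function name and the row layout (constant, IVs, y) are assumptions made for illustration; the algebraic form used by the program is the one given in Herring [1967].

```python
def summation_matrix(design_rows, y):
    """Sums of cross-products for the columns [1, x_1, ..., x_N, y].

    The upper-left (N+1) x (N+1) block is the matrix A of the normal
    equations; the last column holds the cross-product terms with y.
    """
    data = [[1.0] + list(row) + [yi] for row, yi in zip(design_rows, y)]
    width = len(data[0])
    return [
        [sum(row[i] * row[j] for row in data) for j in range(width)]
        for i in range(width)
    ]


# Three observations on one IV (values 1, 2, 3) and y = 2, 4, 6.
s = summation_matrix([[1.0], [2.0], [3.0]], [2.0, 4.0, 6.0])
a_matrix = [row[:-1] for row in s[:-1]]   # normal-equation matrix A
xy_terms = [row[-1] for row in s[:-1]]    # cross-products with y
```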

2.3  The  Ranking  Subroutines  of  NOVACCM 

In  this  section  the  backward  ranking  subroutines  CCMO  and  FEMO 
are  described  in  detail  for  the  ranking  of  CIVs  and  of  groups  of  DIVs, 
respectively.  For  simplification,  the  matrix  A  of  the  normal  equations 
for  the  model  containing  all  N  IVs  is  assumed  to  have  been  successfully 
inverted,  that  is,  the  first  step  in  the  ranking  process  is  also 
considered  to  be  the  first  "good  step."  The  consequences  of  deviations 
from  this  assumption  can  be  seen  without  difficulty  by  following  the  flow 
charts  given  in  Section  2.4.1.  (In  the  flow  charts,  CCMO  and  FEMO  are 
given  in  loop  representations.  This  and  the  fact  that  the  possibility  of 
rejected  models  is  included  in  the  flow  charts  account  for  some  differences 
in  the  notations  used  in  the  present  section  and  in  the  flow  charts.)  When 
the  matrix  inversions  of  one  or  more  steps  in  CCMO  or  FEMO  had  to  be 
rejected  on  the  grounds  of  the  accuracy  criteria  imposed,  the  principal 
methods  of  CCMO  and  FEMO,  as  outlined  below,  remain  unchanged. 

2.3.1  CCMO 


First, the option for only cumulative dropping of CIVs
will be described, then the option for the additional single dropping
procedure. The description of CCMO ("Concomitant variables Magnitude [of
prediction power for y] Ordering") is given in terms of a general step
No. h, where h = 1,2,...,T. (Since T CIVs are assumed in the model, the
total number of steps in CCMO is identical to the total number of CIVs in
the model.) At Step No. h, h-1 CIVs will have been ranked, that is, will
have been dropped from the model and are no longer contained in the set
of the M IVs; see the Main Theorem in Section 2.1.2. The dropping of
CIVs from the model is synonymous with the deletion of the corresponding
rows and columns from the matrix A of the normal equations. It is also
assumed that the program user has decided upon one of the two options for
the admissibility of CIVs: "restricted" or "unrestricted admissibility",
as previously discussed.


Step No. h of CCMO, cumulative dropping:

(1) Determine the admissible CIVs of this step (No. h).

(2) Invert the matrix A of the normal equations of this step with
rank N+1-(h-1) = N-h+2. (If h=1, this is the matrix of the full model
with rank N+1 containing the constant, N-T DIVs, and T CIVs. If h > 1,
this is the matrix from which the rows and columns corresponding to the
h-1 CIVs previously ranked have been deleted.)

(3) Compute, for all admissible CIVs ($x_v$'s) of this step, the terms

$$SS(h,v) = \frac{\left(b_v^{(h)}\right)^2}{c_{vv}^{(h)}},$$

where

$b_v^{(h)}$ = regression coefficient of $x_v$ at step h,

$c_{vv}^{(h)}$ = main diagonal element (corresponding to $x_v$) of the
inverse matrix, $A^{-1}$, with rank N-h+2.

(4) Find, for all admissible CIVs $x_v$, MIN{SS(h,v)} = SS(h,-), and
denote the CIV for which SS(h,v) is minimum by $x_{-}$. The CIV $x_{-}$ is the hth
least important CIV. Store its symbol for the Final Comprehensive Analysis.


(5) Compute

    $SS(1)^{(h)} = \sum_{i=1}^{h} SS(i,-)$

    * $DF(1)^{(h)} = h$

    $SS(2)^{(h)} = ATSS - ASSR(N) = SS(2)^{(1)} = \text{const.}$

    * $DF(2)^{(h)} = n-N-1 = DF(2)^{(1)} = \text{const.}$

    * $\text{COEFF DET}^{(h)} = \dfrac{ASSR(N) - \sum_{i=1}^{h-1} SS(i,-)}{ATSS}$

and define

    * $\text{DIFF MS}^{(h)} = SS(h,-)$

    * $\text{DIFF DF}^{(h)} = 1 = \text{const.},$

where the notation used is that of the Main Theorem, Section 2.1.2.

Store the terms marked by asterisks for the Final Comprehensive Analysis.

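(6) Using the terms in (5), compute

    * $MS(1)^{(h)} = \dfrac{SS(1)^{(h)}}{DF(1)^{(h)}}$

    * $MS(2)^{(h)} = \dfrac{SS(2)^{(h)}}{DF(2)^{(h)}} = \text{const.}$

    * $F^{(h)} = \dfrac{MS(1)^{(h)}}{MS(2)^{(h)}}$

    * $I(X)^{(h)} = I_{x^{(h)}}\!\left(\dfrac{DF(2)^{(h)}}{2},\ \dfrac{DF(1)^{(h)}}{2}\right),$

where

$$x^{(h)} = \frac{1}{1 + \dfrac{SS(1)^{(h)}}{SS(2)^{(h)}}}$$

is the value denoted as x in (2-12).

Store the terms marked by asterisks for the Final Comprehensive Analysis.

(7) Go to Step No. (h+1) by replacing, in the above computations
(1) through (6), the index h by h+1.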
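
The cumulative dropping loop of steps (1) through (7) can be sketched in a few lines. The sketch assumes that every remaining CIV is admissible at every step and that each inversion succeeds (no accuracy criteria, no rejected models); the names `invert`, `ccmo_cumulative`, `a`, `g`, and `civ_idx` are hypothetical, and the Gauss-Jordan inversion merely stands in for whatever inversion routine the program uses.

```python
def invert(a):
    """Gauss-Jordan inversion with partial pivoting (illustrative only)."""
    n = len(a)
    m = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]


def ccmo_cumulative(a, g, civ_idx):
    """Rank CIVs from least to most important.

    a: matrix of the normal equations, g: cross-products with y,
    civ_idx: row/column indices of the CIVs within a.
    Returns a list of (index, SS(h,-)) in the order of dropping.
    """
    keep = list(range(len(a)))
    remaining = list(civ_idx)
    ranking = []
    while remaining:
        sub = [[a[i][j] for j in keep] for i in keep]
        inv = invert(sub)
        b = [sum(inv[r][c] * g[keep[c]] for c in range(len(keep)))
             for r in range(len(keep))]
        # SS(h,v) = b_v^2 / c_vv for each remaining CIV
        ss = {}
        for v in remaining:
            p = keep.index(v)
            ss[v] = b[p] ** 2 / inv[p][p]
        least = min(ss, key=ss.get)
        ranking.append((least, ss[least]))
        keep.remove(least)        # delete its row and column
        remaining.remove(least)
    return ranking


# Orthogonal toy design: constant, x1, x2 with y = 3*x1 + 0.5*x2.
a = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]
g = [0.0, 12.0, 2.0]
ranking = ccmo_cumulative(a, g, [1, 2])
```

In this toy case the second variable contributes the smaller sum of squares and is therefore ranked first, i.e., as the least important CIV.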

Final Comprehensive Analysis of CCMO, cumulative dropping.

The Final Comprehensive Analysis (FCA) of CCMO, cumulative dropping,
contains for each step the 9 values marked by asterisks in (5) and (6)
above. Also in the FCA, each step is identified by its number, and the
symbol of the CIV which was ranked at this step is given. There is one
more column in the FCA, marked "PRC", in which an asterisk is printed for
that step of CCMO at which I(X) ≤ α occurs for the first time. Here α may
assume up to three different specified input values, so that asterisks
may be printed at up to 3 different steps of CCMO, cumulative dropping.
For the use of the FCA see Section 3.3.1.
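
How a value of I(X) may be obtained from the stored sums of squares and degrees of freedom is sketched below. The sketch assumes that I(X) is the regularized incomplete beta function with parameters DF(2)/2 and DF(1)/2, evaluated at x = 1/(1 + SS(1)/SS(2)), which equals the upper-tail probability of the F-value; the report's own definition is (2-12). The continued-fraction evaluation is a standard textbook method and not the program's I(X)-subroutine, and all function names are hypothetical.

```python
import math


def _beta_cf(a, b, x, max_iter=300, eps=1e-15, tiny=1e-300):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        for aa in (m * (b - m) * x / ((qam + m2) * (a + m2)),
                   -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def reg_inc_beta(x, a, b):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def i_of_x(ss1, df1, ss2, df2):
    """Assumed form of I(X): upper-tail probability of F = MS(1)/MS(2)."""
    x = 1.0 / (1.0 + ss1 / ss2)
    return reg_inc_beta(x, df2 / 2.0, df1 / 2.0)


p = i_of_x(ss1=8.0, df1=2, ss2=10.0, df2=10)
```

For DF(1) = 2 the tail probability has the closed form (1 + 2F/DF(2))^(-DF(2)/2), which gives an independent check of the sketch.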
CCMO, single dropping.

In the following, the single dropping procedure of CCMO is described
in the terms of the cumulative dropping procedure given before. (The
general step number, h, is the same as in CCMO, cumulative dropping. In
the flow charts, see Section 2.4.1, the general step number of the single
dropping procedure of CCMO is denoted for clarity as "p.") When the
program user chooses the option for "cumulative and single dropping"
(column 24, Control Card No. 1, see Section 3.1.1), the CCMO single
dropping procedure will be performed in addition to the cumulative
procedure. In other words, the cumulative dropping procedure is always
executed. CCMO, single dropping, uses the ranking order of the CIVs
established by the cumulative ranking procedure. See also Section 2.1.2
for a more extensive discussion.

Step No. h of CCMO, single dropping:

(1) Compute

    $SS(1)^{(h)} = SS(h,-)$

    * $DF(1)^{(h)} = 1 = \text{const.}$

    $SS(2)^{(h)} = ATSS - ASSR(N) + \sum_{i=1}^{h-1} SS(i,-)$

    * $DF(2)^{(h)} = n-N+h-2$

    * $\text{COEFF DET}^{(h)} = \dfrac{ASSR(N) - \sum_{i=1}^{h-1} SS(i,-)}{ATSS}$

and define

    * $\text{DIFF MS}^{(h)} = SS(h,-)$

    * $\text{DIFF DF}^{(h)} = 1$

Store the terms marked by asterisks for the FCA.

(2) Use the terms in (1) and compute $MS(1)^{(h)}$, $MS(2)^{(h)}$, $F^{(h)}$, and
$I(X)^{(h)}$ as shown in (6) of Step h, CCMO, cumulative dropping, and store
them for the FCA.

(3) Go to Step No. (h+1) by replacing, in the above computations (1)
and (2), the index h by h+1.

The Final Comprehensive Analysis of CCMO, single dropping,
corresponds in all features to the FCA of CCMO, cumulative dropping.
Therefore, no further discussion is necessary.

2.3.2 FEMO

After terminating CCMO (if applicable), the program goes to
the subroutine FEMO ("Factorial Effects Magnitude [of prediction power for
y] Ordering") for the ranking of factorial effects represented by groups
of DIVs. As was discussed in Section 2.1.2, the model of the first step
of FEMO includes the significant CIVs from CCMO if there were significant
CIVs according to the α-value No. KALPHA. The step of CCMO at
which I(X) ≤ α for the first time is referred to, in FEMO, as $h_0$.
See also paragraphs (2) and (4) below.

The description of FEMO is given in terms of a general
step No. k, where k = 1,2,..., number of the step at which the last effect is
ranked. The dropping of a factorial effect from the model is synonymous
with the deletion (from the matrix A) of the rows and columns which
correspond to the DIVs in the group of DIVs representing the factorial
effect.

The following definitions, which correspond to those
used in CCMO, are used in the formulation of FEMO: The admissible
effects at the kth step of FEMO are defined by the arguments (k,i),
where i = 1,2,..., runs over the set of admissible effects at this step. The
argument (k,i) is used, for example, in SS(k,i) = Additional Regression
Sum of Squares, at the kth step, due to that group of DIVs which represent
admissible effect "i." The computation of SS(k,i) in paragraph (3) below
is as given, for example, in Hader and Grandage [1958]. The term DF(k,i)
stands for "Degrees of Freedom" of the effect with argument (k,i), or:
of effect (k,i), and is equal to the number of DIVs representing effect
(k,i).


If  FEMO  was  preceded  by  a  CCMO,  the  same  option  regarding 
the  admissibility  as  was  chosen  for  CCMO  is  applicable  for  FEMO  when 
quantitative  factors  are  contained  in  the  ANOVA  model:  "restricted 
admissibility"  of  factorial  effects,  or  "unrestricted  admissibility" 
which  here  means  "relaxed  admissibility."  Otherwise  (assuming  there  was 
no  CCMO),  it  is  supposed  that  the  program  user  has  decided  upon  one  of  the 
two  options.  The  option  for  only  cumulative  dropping  of  groups  of  DIVs 
will  be  described  first. 
Step No. k of FEMO, cumulative dropping:

(1) Determine the admissible effects (k,i).

(2) Invert the matrix A of the normal equations with rank $M_k$, where

    $M_k = N+2 - h_0 - \sum_{j=1}^{k-1} DF(j,-)$   if there were significant CIVs ($h_0 \le T$),

    $M_k = N+1 - T - \sum_{j=1}^{k-1} DF(j,-)$   if there were no significant CIVs or
                                                  no CIVs at all (T=0).

(Note: The argument (j,-) denotes one of the factorial effects ranked by
FEMO prior to Step No. k. See paragraph (6) below.)

The above implies the inversion of a matrix which results from
the original (N+1)x(N+1) matrix of the normal equations by deleting (a) the
($h_0$-1) or T rows and columns corresponding to the ($h_0$-1) or T, respectively,
non-significant CIVs and (b) those $\sum_{j=1}^{k-1} DF(j,-)$ rows and columns which
correspond to the DIVs representing the effects ranked at the previous
k-1 steps. That is, the matrix with rank $M_k$ contains, if applicable, the
T-$h_0$+1 rows and columns corresponding to the significant CIVs from CCMO.

(3) Compute for all admissible effects (k,i) of this step the values

$$SS(k,i) =
\begin{pmatrix} b_{i_1}^{(k)} & b_{i_2}^{(k)} & \cdots & b_{i_D}^{(k)} \end{pmatrix}
\begin{pmatrix}
c_{i_1 i_1}^{(k)} & c_{i_1 i_2}^{(k)} & \cdots & c_{i_1 i_D}^{(k)} \\
c_{i_2 i_1}^{(k)} & c_{i_2 i_2}^{(k)} & \cdots & c_{i_2 i_D}^{(k)} \\
\vdots & \vdots & & \vdots \\
c_{i_D i_1}^{(k)} & c_{i_D i_2}^{(k)} & \cdots & c_{i_D i_D}^{(k)}
\end{pmatrix}^{-1}
\begin{pmatrix} b_{i_1}^{(k)} \\ b_{i_2}^{(k)} \\ \vdots \\ b_{i_D}^{(k)} \end{pmatrix}$$

where

$i_v$ = subscript indicating one of the DIVs which represent
effect (k,i), with v = 1,2,...,D, where D = DF(k,i),

$b_{i_v}^{(k)}$ = regression coefficient for DIV No. $i_v$ at Step No. k,

$c_{i_v i_w}^{(k)}$ = element of the inverse matrix with rank $M_k$ for the row
corresponding to DIV No. $i_v$ and for the column corresponding
to DIV No. $i_w$.

(4) Compute, for all admissible effects (k,i) of this step, the values

    $SS(1)^{(k,i)} = \sum_{j=1}^{k-1} SS(j,-) + SS(k,i),$

    $DF(1)^{(k,i)} = \sum_{j=1}^{k-1} DF(j,-) + DF(k,i),$

    $SS(2)^{(k)} = SS(2)^{(1)} = ATSS - ASSR(N) + \sum_{i=1}^{h_0-1} SS(i,-)$   if there were
    significant CIVs ($h_0 \le T$), where $\sum_{i=1}^{h_0-1} SS(i,-)$ is taken from CCMO;

    $SS(2)^{(k)} = SS(2)^{(1)} = ATSS - ASSR(N) + \sum_{i=1}^{T} SS(i,-)$   if there were
    no significant CIVs or no CIVs at all (T=0);

    $DF(2)^{(k)} = DF(2)^{(1)} = n-N-2+h_0$   for the first case above ($h_0 \le T$);

    $DF(2)^{(k)} = DF(2)^{(1)} = n-N-1+T$   for the second case above.

(5) Using the terms from above, compute

$$I(X)^{(k,i)} = I_{x^{(k,i)}}\!\left(\frac{DF(2)^{(k)}}{2},\ \frac{DF(1)^{(k,i)}}{2}\right),$$

where

$$x^{(k,i)} = \frac{1}{1 + \dfrac{SS(1)^{(k,i)}}{SS(2)^{(k)}}}.$$

(6) Find, for the admissible effects (k,i) of this step,
$\mathrm{MAX}[I(X)^{(k,i)}] = I(X)^{(k,-)}$, and let the kth least important effect (k,-)
be defined by this equation, if $\mathrm{MAX}[I(X)^{(k,i)}] > C_0 = 10^{-8}$. (The
numerical value of the constant $C_0$ has been chosen as $10^{-8}$ in accordance
with the computational accuracy of the I(X)-subroutine. Note that, if
$\mathrm{MAX}[I(X)^{(k,i)}] \le C_0$, the ranking procedure must have advanced well into the
significant model since any chosen significance level α will be larger than
$10^{-8}$.) In this case, i.e., if $\mathrm{MAX}[I(X)^{(k,i)}] > C_0$, compute and/or store
the following terms for the Final Comprehensive Analysis of FEMO:


    (a)  Symbol of effect (k,-)

    (b)  $I(X)^{(k)} = I(X)^{(k,-)}$

    (c)  $\text{DIFF MS}^{(k)} = SS(k,-)/DF(k,-)$

    (d)  $\text{DIFF DF}^{(k)} = DF(k,-)$

    (e)  $DF(1)^{(k)} = DF(1)^{(k,-)}$

    (f)  $DF(2)^{(k)}$

    (g)  $MS(1)^{(k)} = \dfrac{SS(1)^{(k,-)}}{DF(1)^{(k)}}$

    (h)  $MS(2)^{(k)} = \dfrac{SS(2)^{(k)}}{DF(2)^{(k)}}$

    (i)  $F^{(k)} = \dfrac{MS(1)^{(k)}}{MS(2)^{(k)}}$

    (j)  $\text{COEFF DET}^{(k)} = \dfrac{ASSR(M_k - 1)}{ATSS}$, or:

         $= \dfrac{1}{ATSS}\left[ASSR(N) - \sum_{i=1}^{h_0-1} SS(i,-) - \sum_{j=1}^{k-1} SS(j,-)\right]$
         if there were significant CIVs ($h_0 \le T$),

         $= \dfrac{1}{ATSS}\left[ASSR(N) - \sum_{i=1}^{T} SS(i,-) - \sum_{j=1}^{k-1} SS(j,-)\right]$
         if there were no significant CIVs or no CIVs at all (T=0).

If $\mathrm{MAX}[I(X)^{(k,i)}] \le C_0$, go to the "+-procedure" as outlined below.




(7) Go to Step No. (k+1) by replacing, in the above computations
(1) through (6), the index k by k+1.
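
The quadratic form of paragraph (3) can be sketched as follows. The function names `solve` and `effect_ss`, and the argument layout (full coefficient vector, full inverse matrix, list of DIV positions for the effect), are assumptions for illustration. The sketch solves a small linear system instead of forming the inverse of the sub-block explicitly, which gives the same value of SS(k,i).

```python
def solve(c, b):
    """Solve c z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(c, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    z = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][j] * z[j] for j in range(r + 1, n))
        z[r] = (m[r][n] - tail) / m[r][r]
    return z


def effect_ss(b, inv, group):
    """SS(k,i) = b_g' (C_gg)^-1 b_g for the DIVs listed in group.

    b: regression coefficients at this step, inv: inverse of the
    normal-equation matrix, group: positions of the effect's DIVs.
    """
    bg = [b[i] for i in group]
    cg = [[inv[i][j] for j in group] for i in group]
    z = solve(cg, bg)
    return sum(bi * zi for bi, zi in zip(bg, z))


# Orthogonal toy case: the group SS is the sum of the single-DIV terms.
inv = [[0.25, 0.0, 0.0], [0.0, 0.25, 0.0], [0.0, 0.0, 0.25]]
ss_group = effect_ss([0.0, 3.0, 0.5], inv, [1, 2])
```

For a group consisting of a single DIV the expression reduces to b²/c, i.e., to the term used for CIVs in CCMO.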


The "+-procedure" (Step $k^{+}$) of FEMO, cumulative dropping:

This modification of FEMO (the "+-procedure") will apply only when
$\mathrm{MAX}[I(X)^{(k,i)}] \le C_0$, where the superscript of I(X) may also read ($k^{+}+\delta$, i),
($k^{++}+\varepsilon$, i), etc.; see further below. In the +-procedure, the terms which
were computed at Step No. k as described above are used. Therefore, the
+-procedure is also referred to as "Step $k^{+}$." The +-procedure serves to
increase the I(X)-values in order that the remaining factorial effects in
the significant model may be ranked with respect to their relative
importance. This is achieved by pooling all previously ranked effects
into the experimental error, that is, by a redefinition of the model,
as follows:


(1) Compute and/or define

    $SS(1)^{(k^{+},i)} = SS(k,i)$

    $DF(1)^{(k^{+},i)} = DF(k,i)$

    $SS(2)^{(k^{+})} = SS(2)^{(1)} + \sum_{j=1}^{k-1} SS(j,-)$

    $DF(2)^{(k^{+})} = DF(2)^{(1)} + \sum_{j=1}^{k-1} DF(j,-)$

Using the above four terms, compute the values $I(X)^{(k^{+},i)}$ as in
paragraph (5) of Step No. k.

(2) Find, for the admissible effects (k,i) of this step (which are
the same as in Step k):

    $\mathrm{MAX}[I(X)^{(k^{+},i)}] = I(X)^{(k^{+},-)}.$

If this maximum is greater than $C_0 = 10^{-8}$, let the kth least important
effect (k,-) be defined by this equation. In this case, compute and/or
store, for the FCA, terms (a) - (j) as given in paragraph (6) of Step k,
replacing the index k by $k^{+}$. In the FCA, print the symbol "+" in the
PRC column for this step. Then go to Step ($k^{+}+\delta$), starting with δ=1,
as outlined below. In case of $\mathrm{MAX}[I(X)^{(k^{+},i)}] \le C_0$, go to the "++-procedure"
as outlined further below.
Step No. ($k^{+}+\delta$) of FEMO, cumulative dropping:

With δ = 1,2,..., this is a general step after the +-procedure had
to be applied. The experimental error, which was redefined at Step $k^{+}$,
remains again constant, and the sums of squares due to the effects ranked
are pooled again, as seen in paragraph (2) below.

(1) Determine the admissible effects (k + δ, i).

(2) After carrying out the computations similar to those of paragraphs
(2) and (3) of Step No. k, compute and/or define, for the admissible effects
(k + δ, i) of this step:

    $SS(1)^{(k^{+}+\delta,\,i)} = \sum_{j=k}^{k+\delta-1} SS(j,-) + SS(k+\delta, i)$

    $DF(1)^{(k^{+}+\delta,\,i)} = \sum_{j=k}^{k+\delta-1} DF(j,-) + DF(k+\delta, i)$

    $SS(2)^{(k^{+}+\delta)} = SS(2)^{(k^{+})}$

    $DF(2)^{(k^{+}+\delta)} = DF(2)^{(k^{+})}$

where the latter two right-hand terms are from paragraph (1) of Step $k^{+}$.
Using the above four terms, compute the values $I(X)^{(k^{+}+\delta,\,i)}$ as in paragraph
(5) of Step No. k.

(3) Find, for the admissible effects (k + δ, i) of this step,

    $\mathrm{MAX}[I(X)^{(k^{+}+\delta,\,i)}] = I(X)^{(k^{+}+\delta,\,-)}.$

If this maximum is greater than $C_0$, let the (k + δ)th least important effect
(k + δ, -) be defined by the above equation. In this case, compute and/or
store, for the FCA, terms (a) - (j) as given in paragraph (6) of Step No. k,
replacing the index k by ($k^{+}+\delta$). Then go to Step No. ($k^{+}+\delta+1$) as
outlined above, i.e., by replacing the index δ by (δ+1). If MAX[I(X)] ≤
$C_0 = 10^{-8}$, go to Step $(k^{+}+\delta)^{+}$, i.e., follow the procedure as outlined
in Step $k^{+}$, replacing the index k by ($k^{+}+\delta$).


The "++-procedure" (Step $k^{++}$) of FEMO, cumulative dropping:

This modification of FEMO (the "++-procedure") will apply only when
$\mathrm{MAX}[I(X)^{(k^{+},i)}] \le C_0 = 10^{-8}$, where the superscript of I(X) may also read
[$(k^{+}+\delta)^{+}$, i], [$(k^{++}+\varepsilon)^{+}$, i], etc.; see further below. The ++-procedure
is also referred to as "Step $k^{++}$." The aim of the ++-procedure is to
further increase the I(X)-values (which, at Step $k^{+}$, still were all
below $C_0 = 10^{-8}$) so that the remaining factorial effects in the significant
model may be ranked with respect to their relative importance. For this
purpose, at Step $k^{++}$, the sum of squares due to one of the admissible
effects is added to the error sum of squares according to the definition
in paragraph (1) below. Later, in the computation of I(X), an 'F-value'
according to (2-10), see Section 2.1.2, is implied which contains this
very same sum of squares in both the numerator and the denominator. So,
actually, it is not an F-value, but the computational procedure of I(X)
is employed nevertheless in order to have, also in this case and if
applicable, a ranking criterion by which to establish the relative
importance of the (highly) significant factorial effects.

The ++-procedure is as follows.

(1) Find, among the admissible effects of Step No. k, and using
the terms SS(k,i) from Step No. k,

    $\mathrm{MIN}[MS^{(k,i)}] = \mathrm{MIN}\left[\dfrac{SS(2)^{(k^{+})} + SS(k,i)}{DF(2)^{(k^{+})} + DF(k,i)}\right] = MS^{(k,0)}.$

The above equation defines the effect which minimizes $MS^{(k,i)}$ by the
argument (k,0).

(2) Compute and/or define

    $SS(1)^{(k^{++},i)} = SS(k,i)$

    $DF(1)^{(k^{++},i)} = DF(k,i)$

    $SS(2)^{(k^{++})} = SS(2)^{(k^{+})} + SS(k,0)$

    $DF(2)^{(k^{++})} = DF(2)^{(k^{+})} + DF(k,0)$

Using the above four terms, compute the values $I(X)^{(k^{++},i)}$ as in
paragraph (5) of Step No. k.

(3) Find, for the admissible effects (k,i) of this step (which are
still the same as in Step k):

    $\mathrm{MAX}[I(X)^{(k^{++},i)}] = I(X)^{(k^{++},-)}.$

If this maximum is greater than $C_0$, let the kth least important effect
(k,-) be defined by this equation. (Note that effect (k,-) will not
necessarily be equal to (k,0).) In this case, compute and/or store for
the FCA, terms (a) - (j) as given in paragraph (6) of Step k, replacing
the index k by $k^{++}$. In the FCA, print the symbol "++" in the PRC column
for this step. Then proceed to Step ($k^{++}+\varepsilon$), starting with ε=1, as
outlined below. If $\mathrm{MAX}[I(X)^{(k^{++},i)}] \le C_0$, proceed to the "+++-procedure"
as outlined further below.
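
The redefinition of the error term in the + and ++ procedures amounts to simple bookkeeping, sketched below. The function names and the use of plain tuples and a dictionary are assumptions for illustration; the I(X) computation that follows each redefinition is not repeated here.

```python
def plus_error_terms(ss2_1, df2_1, ranked):
    """Step k+: pool all effects ranked at steps 1..k-1 into the error.

    ranked: list of (SS(j,-), DF(j,-)) for j = 1, ..., k-1.
    Returns (SS(2) at k+, DF(2) at k+).
    """
    ss2_plus = ss2_1 + sum(ss for ss, _ in ranked)
    df2_plus = df2_1 + sum(df for _, df in ranked)
    return ss2_plus, df2_plus


def plus_plus_error_terms(ss2_plus, df2_plus, candidates):
    """Step k++: additionally pool the effect (k,0) with minimum pooled MS.

    candidates: dict mapping an effect symbol to (SS(k,i), DF(k,i)).
    Returns (symbol of (k,0), SS(2) at k++, DF(2) at k++).
    """
    def pooled_ms(effect):
        ss, df = candidates[effect]
        return (ss2_plus + ss) / (df2_plus + df)

    effect0 = min(candidates, key=pooled_ms)
    ss0, df0 = candidates[effect0]
    return effect0, ss2_plus + ss0, df2_plus + df0


# Two effects already ranked, two admissible effects left at Step k.
ss2_plus, df2_plus = plus_error_terms(10.0, 5, [(2.0, 1), (3.0, 2)])
effect0, ss2_pp, df2_pp = plus_plus_error_terms(
    ss2_plus, df2_plus, {"A": (100.0, 2), "B": (40.0, 1)})
```

Each stage enlarges the denominator sum of squares, which lowers the implied F-value and thereby raises I(X) above the threshold at which the values can still be compared.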

Step No. ($k^{++}+\varepsilon$) of FEMO, cumulative dropping:

With ε = 1,2,..., this is a general step after the ++-procedure had
to be applied. The error sum of squares is now defined to consist of the
original error sum of squares pooled with the sums
of squares due to all effects ranked before and at Step No. k. For
ε = 1,2,..., the error sum of squares remains constant again. From Step
No. $k^{++}+1$ on, the sums of squares due to the effects ranked are pooled
again, as seen in paragraph (2) below.

(1) Determine the admissible effects (k + ε, i).

(2) After carrying out the computations which are similar to those
of paragraphs (2) and (3) of Step No. k, compute and/or define, for the
admissible effects (k + ε, i) of this step:

    $SS(1)^{(k^{++}+\varepsilon,\,i)} = \sum_{j=k+1}^{k+\varepsilon-1} SS(j,-) + SS(k+\varepsilon, i)$

    $DF(1)^{(k^{++}+\varepsilon,\,i)} = \sum_{j=k+1}^{k+\varepsilon-1} DF(j,-) + DF(k+\varepsilon, i)$

    $SS(2)^{(k^{++}+\varepsilon)} = SS(2)^{(k)} + \sum_{j=1}^{k} SS(j,-)$

    $DF(2)^{(k^{++}+\varepsilon)} = DF(2)^{(k)} + \sum_{j=1}^{k} DF(j,-)$

Using the above four terms, compute the values $I(X)^{(k^{++}+\varepsilon,\,i)}$ as in
paragraph (5) of Step No. k.

(3) Find, for the admissible effects (k + ε, i) of this step,

    $\mathrm{MAX}[I(X)^{(k^{++}+\varepsilon,\,i)}] = I(X)^{(k^{++}+\varepsilon,\,-)}.$

If this maximum is greater than $C_0$, let the (k + ε)th least important
effect (k + ε, -) be defined by the above equation. In this case, compute
and/or store, for the FCA, terms (a) - (j) as given in paragraph (6) of
Step No. k, replacing the index k by $k^{++}+\varepsilon$. Then go to Step ($k^{++}+\varepsilon+1$)
as outlined above, i.e., by replacing the index ε by (ε+1). If
$\mathrm{MAX}[I(X)^{(k^{++}+\varepsilon,\,i)}] \le C_0$, go to Step $(k^{++}+\varepsilon)^{+}$, i.e., proceed as in Step $k^{+}$,
replacing the index k by ($k^{++}+\varepsilon$).

The "+++-procedure" (Step $k^{+++}$) of FEMO, cumulative dropping:

This last modification of FEMO will apply only when
$\mathrm{MAX}[I(X)^{(k^{++},i)}] \le C_0$, where the superscript of I(X) may also read
[$(k^{+}+\delta)^{++}$, i], [$(k^{++}+\varepsilon)^{++}$, i], etc. The +++-procedure is also
referred to as "Step $k^{+++}$." The aim of the procedure is to define, for
Step No. k of the ranking, a least important effect after even the
++-procedure failed to increase the I(X)-values sufficiently.

In this case merely define effect (k,0) from Step $k^{++}$ to be effect
(k,-). Then compute and/or store for the FCA the terms (a) - (j) as given
in paragraph (6) of Step k, replacing the index k by $k^{+++}$. In the FCA,
print the symbol "+++" in the PRC column for this step. Then proceed
to Step No. ($k^{++}+\varepsilon$), starting with ε=1, as was described before.

FEMO,  single  dropping. 

Finally for FEMO, the single dropping procedure will be discussed.

"Single dropping" is executed in addition to FEMO, cumulative dropping,
when the appropriate option is chosen (column 24 of Control Card No. 1,
see Section 3.1.1). As was correspondingly the case for CCMO, single
dropping, the single dropping procedure of FEMO uses the ranking order of
the factorial effects established by the cumulative dropping procedure.

The single dropping procedure of FEMO then simply consists of the
"+-procedure" described before, which is followed all the way through,
from the first to the last step. At appropriate places, the terms computed
in the cumulative dropping procedure are used for the computations and/or
for the Final Comprehensive Analysis, FEMO, single dropping. Since the
ranking order has already been established in the cumulative procedure,
the single dropping option never needs to go into the ++- or +++-procedure.

Final  Comprehensive  Analyses  of  FEMO. 


The Final Comprehensive Analyses for both the cumulative and the
single dropping procedure in FEMO correspond to those described for CCMO.
As was mentioned at the appropriate places, the symbols "+", "++", and
"+++" are printed in the PRC column when the corresponding procedure led
to the ranking of the effect for which the symbol is printed. Also in the
PRC column, an asterisk is printed whenever I(X) is smaller, for the first
time, than one of the possibly up to three α-significance levels used as
input, thus marking the "significant model" which corresponds to the
respective α-level.

When there are two or more sets of Control Card No. 4, i.e., when
the data of a given classification has been analysed by fitting two or
more models respectively, all FCAs of FEMO are repeated at the end of
the problem in order to facilitate the search for the "most probable
significant model." (See also Section 3.3.2.)

2.4  Computational  Flow 
2.4.1  Flowcharts 


In this section the computational flow of NOVACCM is
given in the form of logical flowcharts. These flowcharts reflect
only the method of the program and are not expressed in the terms of a
programming language. Some features which were discussed in previous
sections, like the determination of the admissible CIVs or effects, are
not contained in the charts. Wherever it is considered necessary for the
understanding of the flowcharts, comments are provided which are listed
in Section 2.4.2. The flowchart boxes for which comments are given in
Section 2.4.2 have been marked by decimal classification numbers of which
the first is the number of the chart and the second is the box number
within the chart.
Chart 4 (flowchart; box texts as far as legible):

Is this an attempted CCMO?

Was there an attempted CCMO (without any good steps)?

Set g = g-T.

Define the admissible effects. ("Restricted" or "Relaxed Admissibility"
option, as chosen.) Delete the "rightmost" admissible effect with minimum
degrees of freedom. Define this effect as "(g,-)" (with DF(g,-) degrees
of freedom) and print its symbol in the FCA.

Define the admissible CIVs. ("Restricted" or "Unrestricted Admissibility"
option, as chosen.) Delete the "rightmost" admissible CIV, calling it $x_g$.
Print the symbol of $x_g$ in the FCA.

Is g = T?

Is Σ DF(j,-) = N-T? (sum starting at j=1)

Is the option for both cumulative and single dropping chosen? (box shown
in both the CCMO and the FEMO branch)

Following the heading "FCA CCMO SINGLE DROPPING", print: "NO PRINTOUT FOR
CCMO SINGLE DROPPING SINCE NO VALID SUMS OF SQUARES WERE COMPUTED."

Following the heading "FCA FEMO SINGLE DROPPING", print: "NO PRINTOUT FOR
FEMO SINGLE DROPPING SINCE NO VALID SUMS OF SQUARES WERE COMPUTED."

Will there be a FEMO?

$M_g = M_{g-1} - DF(g,-)$

Set g = g+1 and $M_g = M_{g-1} - 1$.
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Step  h  = 


of  CCMO,  cumulative  dro 


Define  the  admissible  CIVs.  ("Restricted"  or  "Unrestricted 
Admissibility"-option,  as  chosen.) 

Print  the  numbers  of  the  admissible  CIVs. 

Compute  the  t^/c^  for  all  admissible  CIVs. 

Find  MIfl[bg/cyv]  =  SS(h,-)  and  define  corresponding  CIV  for 
v 

later  deletion  from  the  model.  h-1 

Compute:  ASSS(M*-1-Tl)  -  AS3R(N-h'+l)  -  £  SS(i,-) 

i*h 1 

Compute  ( and  store*  for  FCA) :  CCEFF.  DET.  -  -i—  A3SR(I^-1-11 ) 

AT88 

Compute  nnd/nr  define  (and  store  for  FCA): 


esd)*"'  *-  r  3s(i,-l 

l»h' 

DF(1)(I°  =  h-h'+l 

MS(1)(‘>  =  ss(i)<0/df(1)(0 

SS(2)(°  =  ATSS-ASSR(IV-l) 

DF(2)(  * 1  »  n-M*  «  n-N-2+h' 

DIFF  MS*°  =  SS(h,-) 

DIFF  DF<°  =1 


Is  IFt-!)*0  »  07 


Compute  a  nr'  store  for  FCA: 

MS(2)**5  »  MS(C)*  °  «  S8(2)(0/DF(2)to 
r*°  .  MS(1)(°/MS(2)( ° 

I(X)<° 

Print  the  l(X)-viilues  and  their 
arguunenta  at  into rnu-d late  steps. 


Zero  error  perfect  fit  at  h  =  h*  (but 
the  following  values  apply  also  at  all 
subsequent  steps): 

Set,  in  FCA: 

KS(2)(  =  0 

=  0^999999 
I(X)<»’)  =  0 

Set  /KSfl)**''  -  0 
Set  {stand,  dev.  of  b^  ’s]  e  0 


(Ch.  Y) 


Chart 6

[The flowchart pages that follow are largely illegible in the source. The legible box texts are listed in order of appearance.]

Will there be a FEMO?

Was the significance level No. KALPHA reached (by I(X)) at a step h_α?

No: Compute (and store for FEMO):

    SS(2) = ATSS - ASSR(N-T)
    DF(2) = n-1-(N-T)

Yes: Restore the model of step h_α containing all N-T DIVs and T-h_α+1 CIVs. Compute (and store for FEMO):

    SS(2) = ATSS - ASSR(N-h_α+1)
    DF(2) = n-N-2+h_α

Print FCA, cumulative dropping.

Is the option for both cumulative and single dropping chosen?

Chart 12 (legible box texts):

Compute the I(X)-values and find MAX[I(X)]. Print the I(X)-values and their arguments.

Compare MAX[I(X)] with C0 = 10^-8.

Chart 12


Delete effect (k,-) defined previously.

Is the sum over j (from j = 1) of DF(j,-) equal to N-T?

No: Set ω = ω+1 and invert the matrix with rank

    M_k = M_k' - SUM(j = k', ..., k-1) DF(j,-).

Compute: ASSR = ASSR(M_k - 1).

Yes: Print FCA, FEMO, cumulative dropping.

Is the option for both cumulative and single dropping chosen?

(Connectors lead to Chart 11 and Chart 15.)

Chart 14

2.4.2 Comments on Flowcharts


The  following  comments  refer  to  the  flowcharts  and  to  the 
boxes  in  the  flowcharts  of  the  previous  section,  where  the  flowchart 
numbers  and  box  numbers  are  as  given. 

Chart  1. 

Box  1.1  The  uncoded  OCIV  values  are  printed  in  order  to  provide  a 
possibility  to  check  the  input.  If  coding  is  requested,  the  ranges  and 
averages  (and  the  range  factor  C)  are  printed  to  facilitate  the  back- 
transformation  of  output  values  if  desired  and  feasible. 

The  bar  charts  (in  regression  only)  of  the  uncoded  OCIV  values  are 
printed  to  give  a  possibility  of  visual  checks  on  the  distributions  of 
the  values,  i.e.,  on  their  approximate  normality  and  on  outlying  values. 

Box  1.2  See  comments  Box  1.1. 

Box 1.3 The "Full Data Matrix" is the (N+W) x n matrix of the n values
(coded, if applicable) of the N IVs and the W dependent variables (W ≤ 4).

It is this matrix from which the (N+1) x (N+1+W) "Summation Matrix" is
generated by the program. The summation matrix is the (N+1) x (N+1) matrix
of the coefficients of the normal equations augmented by W ≤ 4 row vectors
containing the cross products with the y's. (In most problems, there will
be only one dependent variable, y, and the summation matrix will consist of
the matrix of the normal equations augmented by one row vector containing
the cross products with y.) The ATSS-values (one for each dependent
variable) are the total sums of squares (of y) adjusted for the mean. The
ATSS are, naturally, equal for all CC No. 4 sets if there are several such
sets.
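
As an illustration of Box 1.3, the following is a minimal sketch of how a summation matrix of the stated (N+1) x (N+1+W) shape and the ATSS could be formed from the full data matrix. It is not the NOVACCM code. The function names are hypothetical, the constant x0 = 1 is assumed to occupy the first row and column, and the cross products with the y's are appended here as extra columns so that the result has the stated shape.

```python
def summation_matrix(x_rows, y_rows):
    """Sketch of a summation matrix (hypothetical helper, not NOVACCM code).

    x_rows: N lists, each holding the n observed values of one IV.
    y_rows: W lists, each holding the n observed values of one dependent variable.
    Returns an (N+1) x (N+1+W) list of lists.
    """
    n = len(x_rows[0])
    # Assumption: the constant x0 = 1 is placed first.
    design = [[1.0] * n] + [[float(v) for v in row] for row in x_rows]
    targets = [[float(v) for v in row] for row in y_rows]
    columns = design + targets
    # Entry (i, j) is the sum of products of variable i with variable j.
    return [[sum(a * b for a, b in zip(u, v)) for v in columns] for u in design]


def atss(y):
    """Total sum of squares of y adjusted for the mean."""
    mean = sum(y) / len(y)
    return sum((v - mean) ** 2 for v in y)
```

For one IV with values 1, 2, 3 and one dependent variable with values 2, 4, 6, the sketch yields the rows [3, 6, 12] and [6, 14, 28], and an ATSS of 8.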

Chart  2. 

Box 2.1 This applies when the problem is one of regression only.

Box 2.2 The additional analyses of variance ("ANVAs") are essentially
given in the form of FEMO-FCAs. The ANVAs are counted and printed (1)
for each dependent variable for which significant IVs were found in an
analysis of covariance and (2) for each OCIV which is significant or is
a sub-CIV of a significant GCIV. In addition to the FCAs, the symbols of
the admissible effects and all computed I(X)-values and their arguments
are printed for each step of each ANVA. See also Sections 2.1.1, 5.5.3
and Example 6 in Section 3.4.6.

Box 2.3 The "Final FCA" is a reprinting of the individual FCAs for
each Control Card 4 set. The Final FCA enables the program user to compare
more easily the results from the various Control Card 4 sets and, thereby,
facilitates the search for the "most probable significant model." See
also Section 3.3.2 and Example 5 in Section 3.4.5.

Chart 3.

Box 3.1 Step No. g is a general step of the ranking procedure (CCMO
or FEMO) before an acceptable inverse of the matrix A of the normal equations
is found. See also Section 2.1.3.

Box 3.2 Although the inverse is rejected, A^-1, a» and the b_v's are
printed in order to give the user an impression of the amount of the
inaccuracy.

Box 3.3 The same reasoning as given for Box 3.2 applies here. "IDENT.
MATRIX DIAG. ELEM. NE 1" is short for "At least one identity matrix diagonal
element is not equal to 1, given the tolerance TOLI(2)."
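
The check behind this message can be sketched as follows. This is an illustrative sketch only, under the assumption (taken from the description of the tolerance on Control Card No. 1) that an inverse is acceptable when every diagonal element of the product of the matrix with its inverse deviates from 1 by less than the tolerance. The function name is hypothetical.

```python
def identity_diagonal_ok(a, a_inv, tol):
    """Return True if every diagonal element of a * a_inv is within tol of 1.

    a, a_inv: square matrices given as lists of rows.
    tol: the tolerance (called TOLI(2) in the text).
    """
    size = len(a)
    for i in range(size):
        # i-th diagonal element of the product a * a_inv
        diag = sum(a[i][k] * a_inv[k][i] for k in range(size))
        if abs(diag - 1.0) >= tol:
            return False
    return True
```

An exact inverse passes the check for any positive tolerance; an inverse whose product has a diagonal element of 1.2 fails it for a tolerance of, say, 0.000001.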

Box 3.4 The value ASSR (Regression sum of squares adjusted for the
mean) is given for a rejected first step of an attempted CCMO or FEMO
because it will enable the user to check the amount of the computational
inaccuracy in case of a perfect fit. Namely, in a perfect fit for any
data classification into "cells", ASSR should equal the sum of squares
"between cells", where the latter can be calculated "by hand."

Chart  4. 

Box  4.1  If  there  was  an  attempted  CCMO  without  any  good  steps, 
g=g-T  makes  the  step  numbering  in  FEMO  start  with  "1." 

Box 4.2 The "rightmost" admissible effect to be deleted here is
"rightmost" with respect to the position of the DIVs representing the
effect in the model. Since the generation of the model is such that the
highest-order interactions and the highest powers of the quantitative
factor variables are located "rightmost", these are the effects deleted
first after FEMO steps have been rejected. (Note that there cannot be
statistical criteria by which to delete the effects from the model in
this case of a rejected model.) The "minimum degrees of freedom"
condition (applicable in FEMO only) serves the purpose of reaching an
acceptable model under the smallest loss of degrees of freedom possible.

The  present  way  of  deleting  effects  may  not  be  the  fastest  one  to  arrive 
at  an  acceptable  model.  In  case  of  a  singular  matrix  A,  for  example, 
there  may  be  only  one  IV  which  causes  the  singularity,  but  this  IV  may 
not  necessarily  be  deleted  at  the  first  step  of  an  attempted  COMO  or 
FEMO. See Example 7 in Section 3.4.7.




Box 4.3 Since there were no accepted FEMO steps, there is no basis
for a FEMO, single dropping procedure. (The single dropping procedure
takes the ranking order from the cumulative procedure, see Section 2.3.2.)

Box 4.4 The corresponding comments as given for Box 4.2 apply also
here, except for the "minimum degrees of freedom" condition since in CCMO
only individual IVs are deleted.

Box 4.5 See corresponding comments on Box 4.3.

Chart 5.

Box 5.1 h' and k' are the numbers of the "first good step" in CCMO
and FEMO, respectively. Note that, according to the common matrix A, all
steps of FEMO will be accepted when h' was reached in CCMO.

Box 5.2 The rank, M_h' = N+2-h', of the matrix of the normal equations
for the first good step of CCMO is the difference of N+1 (for N IVs of the
original model and the constant, x_0) and h'-1 (for the h'-1 CIVs deleted
prior to the first good step).

Chart  6. 

Box 6.1 For more details on the ranking of CIVs by the cumulative
dropping procedure of COMO, see Section 2.3.1.

Box 6.2 When DF(2)=0, one deals with a "zero error perfect fit",
which can be reached only at the first good step of the ranking procedure.
Naturally, MS(2), F, and I(X) have to be defined in this case and cannot
be computed. Since DF(2) = DF(2)^(h') remains constant throughout CCMO,
cumulative dropping, the definitions of Box 6.2 apply at each of the
remaining steps of CCMO, cumulative dropping.

Chart  7. 

Box 7.1 If there was a "+", "++", or "+++"-procedure in any of the
previous FEMO steps, this means I(X) had been smaller than C0 = 10^-8, that
is, smaller than any significance level α specified by the user. Therefore,
no full printout will be given (in cumulative or single dropping) beyond
this step of FEMO.

Box 7.2 The asterisks printed in the PRC column of the FCA indicate
clearly the steps of the ranking procedure where the significant models
corresponding to the up to three specified significance levels α have
been reached. Note that the asterisk is also printed when, in CCMO, a
zero error perfect fit was reached at this step. This is because the
zero error perfect fit is by definition the first good step, and as such
leads to an I(X) value of zero (also by definition) which necessarily is
smaller than any specified α-value. Therefore, only one full printout
will be given in case of a zero error perfect fit. This printout is at
the same time that of the first good step.

Box 7.3 The option to print the b_v's at every step of the ranking
is provided to supply the user with some information for the intermediate
steps where no full printouts are given. The regression coefficients, b_v,
which, for example, in FEMO are the estimates of the individual model
parameters, are considered to be the most important numerical values.

Box 7.4 The "full printout" is similar to that given in the program
DA-MRCA (Abt et al. [1966]). The Chi-square test computations for testing
the normality of the distribution of the residuals are exactly as in
DA-MRCA. The "Residual or Error Sum of Squares", the "Check error sum of
squares" and the "Square root of (the) residual variance" are specifically
computed for the step at which they are printed. That is, the "Residual
or Error Sum of Squares" equals ATSS - ASSR(N') when the model of the
given step contains N' IVs. In the single dropping procedure, one has
ATSS - ASSR(N') = SS(2) where SS(2) is the value which, if divided by
DF(2), gives MS(2) as printed in the FCA. (See Section 3.5.1.) A
detailed general formulation of the "full printout" in NOVACCM is given
in Herring [1967]. See also Section 2.1.4 of the present report and
Example 1 in Section 3.4.1.

Chart  8. 


Box 8.1 The α-significance level No. KALPHA is the one (specified
by the user) which determines the significant CIVs to be kept in the
analysis of covariance model when ranking the factorial effects by FEMO.
See also Section 2.1.2 and Control Card No. 1, column 25 (Section 3.1.1).

Box  8.2  Since  all  T  CIVs  are  deleted  from  the  model,  FEMO  will 
operate  on  an  analysis  of  variance  model  only. 

Box  8.3  The  significant  CIVs  will  be  kept  in  the  model,  therefore, 
FEMO  will  operate  on  an  analysis  of  covariance  model. 

Chart 9.


Box 9.1 The single dropping procedure starts with the model which
was that of the first good step of the cumulative dropping procedure.

Box 9.2 For a more detailed description of the single dropping
procedure in COMO, see Section 2.3.1.

Box 9.3 The zero error perfect fit is the same as that reached in
COMO, cumulative dropping. (See Box 6.2 in Chart 6.) However, since in
single dropping the degrees of freedom corresponding to all previously
ranked CIVs are accumulated in DF(2), at Step (p+1) one will have DF(2) > 0
and, consequently, MS(2), F, and I(X) can be computed from Step (p+1) on.

Box 9.4 For the case of a perfect fit (zero error perfect fit, that
is) see the comments on Box 7.2 in Chart 7.

Chart  10. 

Boxes 10.1 and 10.2 See comments on Boxes 8.2 and 8.3, respectively,
in Chart 8.

Chart  11. 

Box 11.1 For a more detailed description of FEMO, cumulative dropping,
see Section 2.3.2. FEMO, cumulative dropping, is presented in Charts 11
through 14 in loop form which is more concise than the detailed manner in
which FEMO is described in Section 2.3.2. The various modifications of
FEMO, cumulative dropping, (the "+", "++", and "+++" procedures) in
addition to the basic procedure, can be summarized as follows:

No "+" at all: SS(1) and DF(1) are due to the group of all
previously ranked (deleted) effects plus the effect presently
searched for. SS(2) and DF(2) are those of the first good step of
FEMO and remain constant thereafter.

"+": SS(1) and DF(1) are due only to the effect presently
searched for. SS(2) and DF(2) are due to the group of all previously
ordered effects. At the first step following the "+"-procedure, the
pooling starts anew for SS(1) and DF(1), but SS(2) and DF(2) remain
constant.

"++": (This procedure is always preceded by the "+"-procedure.)
SS(1) and DF(1) are due only to the effect presently searched for.
SS(2) and DF(2) are due to the group of all previously ordered effects
plus effect "(k,0)". At the steps following the "++"-procedure, effect
(k,0) is replaced, in SS(2) and DF(2), by effect (k++,-); and SS(2)
and DF(2) remain constant from Step (k++ + 1) on. Also at Step (k++ + 1),
SS(1) and DF(1) are due only to the effect then searched for. From
Step (k++ + 1) on, pooling starts anew for SS(1) and DF(1).

"+++": (This procedure is always preceded by the "++"-procedure.)
Here, effect "(k,0)" of the "++"-procedure takes the place of effect
"(k+++,-)." Otherwise, the "+++"-procedure is as the "++"-
procedure.

Box 11.2 If DF(2)=0, the data used leads to a zero error perfect
fit in FEMO. Since there is no basis, in this case, to rank the factorial
effects by the I(X)-criterion (the computation of I(X) requires an error
sum of squares SS(2) > 0), FEMO cannot be executed. The program, therefore,
stops and then goes to the next dependent variable or CC No. 4 set. In order
to avoid the stop, the program user must provide for DF(2) > 0 (which will
imply SS(2) > 0) at the first good step. He may do so by deleting one or
more of the factorial effects.

Chart  12. 

(See comments given for Box 11.1 in Chart 11.)

Box 12.1 The reason for defining effect "(k,0)" as the one which is
to be pooled into the error sum of squares, SS(2), is that (k,0) is
the effect which should reasonably be defined as the "least important
effect" at this step in case the "+++"-procedure becomes necessary. By
previously using (k,0) in the "++"-procedure (one effect has to be defined
for pooling into SS(2)), no additional computations are necessary
in case of the "+++"-procedure.

Charts 13 and 14.

(See comments given for Box 11.1 in Chart 11.)

Chart 15.

Box 15.1 For a more detailed description of the FEMO, single dropping
procedure, see Section 2.3.2. Note that, since in FEMO, single dropping,
the ranking order of the effects is taken from the cumulative procedure,
there is no need for any of the "+"-type procedures.


USE OF NOVACCM


3.1 Input Preparation

3.1.1 Control Cards and Data Cards


In this section the input preparation for NOVACCM is
discussed as far as the control cards and data cards are concerned. The
consequences of the choices of the various options and the use of the
options are described in Sections 3.1.2 and 3.1.3.

The cards are described below in the order of input. The
deck of the identification card, the 6 types of control cards and the 3
types of data cards comprise a "problem deck." An arbitrary number of
problem decks may be stacked one deck after the other. NOVACCM will
perform all the problems in that given order. An end of file card at the
end of one problem deck will terminate the NOVACCM computations.

Identification  Card 


This card contains an 80 column problem identification. The
information  on  this  card  is  completely  at  the  discretion  of  the  user. 


Control Card No. 1


Columns 1-2      Variable E        Format I2     Range 0-99
    The number of factors. Zero or blank when regression (CCMO) only;
    in which case, columns 4-11 are not used.

Column 3         Variable Nonacv   Format I1     Range 0-4
    The number of dependent variables. Zero, blank or 1 implies one
    dependent variable.

Columns 4-5      Variable D        Format I2     Range 0-6 and D ≤ E
    The order up to which the program will automatically generate DIVs.
    Zero or blank when DIVs are to be put in entirely by means of
    CC No. 4.

Column 6         Variable GDD      Format A1     Range G, D, or blank
    GDD = G - generate DIVs described on CC No. 4 and include these
        DIVs with any which may have been automatically generated,
        = D - delete the DIVs described on CC No. 4 from those which
        have been automatically generated,
        = blank - neither generate nor delete DIVs by means of
        CC No. 4.

Columns 7-9      Variable NCC4     Format I3     Range Generation: 0-139; Deletion: 0-255
    The number of DIVs to be generated or deleted. Blank or zero when
    GDD = blank. (NCC4 = constant for all NSCC4 sets of CC No. 4, see
    columns 10-11.)

Columns 10-11    Variable NSCC4    Format I2     Range 0-99
    The number of sets of CC No. 4. Zero, blank, or 1 when one set only.

Columns 12-13    Variable TP       Format I2     Range 0-99
    The number of OCIVs. Zero or blank when analysis of variance (FEMO)
    only; in which case, columns 14-19 are not used.

Columns 14-15    Variable P        Format I2     Range 0-6
    P = 1 if OCIVs only, or OCIVs and additional hand-generated GCIVs
        (by means of CC No. 5).
    1 < P ≤ 6 if GCIVs to be automatically generated up to power sum
        P, plus possibly hand-generated or deleted GCIVs (by means of
        CC No. 5).
    P = 0 or blank only if OCIVs (TP > 0) to be put in entirely by means
        of CC No. 5. (An unusual situation.)
    (Note: GCIVs of power sum 7 to 21 may be hand-generated by means
    of CC No. 5 only.)

Column 16        Variable GDC      Format A1     Range G, D, blank
    GDC = G - generate CIVs described on CC No. 5,
        = D - delete CIVs described on CC No. 5 from the automatically
        generated set of CIVs,
        = blank - no CIVs to be generated or deleted by means of
        CC No. 5.

Columns 17-19    Variable NCC5     Format I3     Range Generation: 0-139; Deletion: 0-254
    The number of CCs No. 5, i.e., the number of CIVs to be deleted or
    hand-generated. Blank or zero when GDC = blank.

Column 20        Variable CODE     Format I1     Range 0,1
    CODE = 0 - code the OCIVs and the quantitative factor level values.
        The coded value x' is formed from x, the mean
        x-bar = (1/n) SUM x, the range R = max(x) - min(x), and C as
        specified in columns 21-23 below. n is the total number of
        observations for OCIV coding or is the number of level values
        of a factor for quantitative factor level coding.
        = 1 - do not code.

Columns 21-23    Variable C        Format F3.0
    The coefficient of R in x' (see column 20) for OCIV coding. (C = 1
    for quantitative factor level coding.) C = zero or blank has the
    same effect as C = 1.0.

Column 24        Variable DROP     Format I1     Range 0,1
    DROP = zero or blank - cumulative dropping only.
        = 1 - cumulative and single dropping.

Column 25        Variable KALPHA   Format I1     Range 0-3
    The cardinal number of the ALPHA value which (in COMO) will be
    used as the significance level for the inclusion of CIVs in the
    FEMO model. In case of FEMO only, the program ignores this column.


Columns 26-29    Variable ALPHA(1) Format F4.4   Range .0001-1.0
    First significance level for CCMO and FEMO.

Columns 30-33    Variable ALPHA(2) Format F4.4   Range 0.0-.9999
    Second significance level for CCMO and FEMO.

Columns 34-37    Variable ALPHA(3) Format F4.4   Range 0.0-.9999
    Third significance level for COMO and FEMO.
    NOTE: These values should be in descending order. The program
    uses only the first non-zero entries.

Column 38        Variable CAD      Format I1
    CAD = 0 - use restricted admissibility rules for ranking in
        COMO and FEMO.
        > 0 - use unrestricted admissibility rules for ranking in
        CCMO and FEMO (i.e., relaxed admissibility rules for FEMO
        when both qualitative and quantitative factors are present).

Columns 39-44    Variable TOLI2    Format E6.2
    A tolerance which is used to check the main diagonal elements of
    the identity matrix formed from the product of the matrix of the
    normal equations with its inverse. If these diagonal elements
    deviate from 1 by an absolute value less than the value of TOLI2
    then the inverse is considered acceptable.

Column 45        Variable IRCO     Format I1     Range 0,1
    IRCO = 0 - do not print the regression coefficients at every step
        of NOVACCM.
        = 1 - print the regression coefficients at every step.

Column 46        Variable ADA      Format I1
    ADA = 0 - do not perform additional analyses of variance ("ANVAs").
    ADA > 0 - perform additional analyses of variance ("ANVAs") for the
        dependent variable(s) after exclusion of all CIVs from the
        model and for each significant OCIV and for each OCIV contained
        in a significant GCIV.
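
The fixed-column layout of CC No. 1 can be illustrated with a small sketch that assembles an 80-column card image. It is not part of NOVACCM. Several column numbers are garbled in the available copy of this report, so the positions used below for GDD (6), P (14-15), DROP (24), ALPHA(1) to ALPHA(3) (26-37), CAD (38), TOLI2 (39-44) and ADA (46) are assumptions inferred from the field widths and the neighbouring fields. The key NDV is a hypothetical stand-in for the column 3 variable, whose name is not legible. Real-valued fields are passed as already formatted text so that no particular punching convention is implied.

```python
# (first column, last column) of each CC No. 1 field, 1-indexed and inclusive.
# Positions marked "assumed" are reconstructions, not confirmed by the source.
CC1_FIELDS = {
    "E": (1, 2), "NDV": (3, 3), "D": (4, 5),
    "GDD": (6, 6),            # assumed
    "NCC4": (7, 9), "NSCC4": (10, 11), "TP": (12, 13),
    "P": (14, 15),            # assumed
    "GDC": (16, 16), "NCC5": (17, 19), "CODE": (20, 20), "C": (21, 23),
    "DROP": (24, 24),         # assumed
    "KALPHA": (25, 25),
    "ALPHA1": (26, 29), "ALPHA2": (30, 33), "ALPHA3": (34, 37),  # assumed
    "CAD": (38, 38),          # assumed
    "TOLI2": (39, 44),        # assumed
    "IRCO": (45, 45),
    "ADA": (46, 46),          # assumed
}


def build_cc1(**values):
    """Return an 80-character card image with each value right-adjusted."""
    card = [" "] * 80
    for name, value in values.items():
        first, last = CC1_FIELDS[name]
        width = last - first + 1
        text = str(value).rjust(width)
        if len(text) > width:
            raise ValueError(f"{name}={value!r} does not fit in {width} column(s)")
        card[first - 1:last] = text
    return "".join(card)
```

For instance, `build_cc1(E=3, D=2, TP=1, P=1, DROP=1, KALPHA=1, ALPHA1=".050")` gives a card with " 3" in columns 1-2, " 2" in columns 4-5 and ".050" in columns 26-29; all fields not named are left blank.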

Control Card No. 2


(Optional  -  omit  when  regression  only:  E  =  0.) 


Column:        1 2    3 4    5 6    7 8    9 10   11 12
Entry:           3      3      3
Factor No.:      1      2      3      4      5      6

CC  No.  2  gives  the  number  of  levels  of  each  factor.  Example  given 
above:  Factors  1,  2,  and  3  have  3  levels  each. 

With two columns per factor there may be entries for 40 factors per
card, the maximum number of levels per factor being 99. Use a second card
if there are more than 40 factors and also a third card if there are more
than 80 factors. Entries are read with an I2 format.
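
The packing rule for CC No. 2 (two columns per factor, 40 factors per card) can be sketched as follows. The function name is hypothetical and the sketch only illustrates the card layout described above.

```python
def build_cc2(levels):
    """Return the CC No. 2 card images for a list of level counts.

    levels: number of levels of factor 1, factor 2, ... (each 1 to 99).
    Each entry is written right-adjusted in two columns, 40 entries per card.
    """
    for count in levels:
        if not 1 <= count <= 99:
            raise ValueError(f"number of levels {count} is outside 1-99")
    cards = []
    for start in range(0, len(levels), 40):
        chunk = levels[start:start + 40]
        cards.append("".join(f"{count:2d}" for count in chunk).ljust(80))
    return cards
```

For the example given above (factors 1, 2, and 3 with 3 levels each) the sketch produces a single card beginning " 3 3 3"; a problem with 85 factors needs three cards.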


Control  Card  No.  3 

(Optional - omit when regression only: E = 0.)


Column:        1 2    3 4    5 6    7 8    9 10
Entry:           2      3

CC  No.  3  gives  the  factor  numbers  of  the  quantitative  factors  using 
2  columns  for  each  factor.  Example  given  above:  Factors  Nos.  2  and  3 
quantitative. 

Blank  card  (or  cards)  when  there  are  no  quantitative  factors  (i.e.,  all 
factors are qualitative). Entries are read with an I2 format.


Control Card No. 4


(Optional  -  omit  when  E  =  0,  or  GDD  =  blank  and  NCC4  =  0. ) 


[Card layout diagram: columns 1 to 18, holding the first, second, and third factor pair of the example DIV.]

Each CC No. 4 gives a DIV to be deleted from the automatically
generated set or to be generated (possibly in addition to the automatically
generated set of order D). The number of CCs No. 4 equals the number of
DIVs to be deleted or generated. (The number of CCs No. 4 is given in
columns 7 to 9 of CC No. 1.) There will be 1 DIV symbol per card of
CC No. 4 format. Providing 2 digits for each factor number and for each
level number or power, the symbol for a DIV of first order will occupy
5 columns; a DIV of second order will occupy 10 columns, etc., (5 columns
for each "factor pair"). The maximum order for a DIV is 6. Example above:
DIV 1.1 x 2.1 x 3.2.

There  may  be  several  sets  of  CC  No.  4,  each  containing  the  same 
number  of  cards  =  number  of  DIVs.  The  number  of  sets  of  CC  No.  4  is  given 
on  CC  No.  1,  columns  10-11.  Each  set  of  CC  No.  4  means  a  separate  analysis 
and, therefore, means a separate final comprehensive analysis (FCA). If
there  is  more  than  one  set  of  CC  No.  4,  then  the  program  repeats,  as  the 
"Final  FCA",  all  the  FCAs  for  FEMO,  cumulative  dropping,  together  as  one 
printout;  likewise  for  FEMO  single  dropping.  Sets  of  CC  No.  4  are  stacked 
one  after  the  other. 

When  preparing  CC  No.  4,  include  in  one  group  DIV  descriptions  of  the 
same  order.  Within  one  of  these  groups,  it  is  not  necessary,  but  slightly 
faster, to include together DIV descriptions having the same factors.
Arrange  the  groups  by  increasing  order. 

The  program  works  even  if  the  natural  (increasing)  order  of  factors 
in a DIV description is violated. For example, writing either 1.1 x 2.1
or 2.1 x 1.1 is possible.


Control  Card  No.  5 

(Optional  -  omit  if  NCC5  =  0  or  blank  and  GDC  =  blank. ) 


[Card layout diagram: columns 1 to 18, showing the example GCIV 2(1) x 4(2). The parts of the symbol marked A to D in the diagram are explained in the notes below.]

Note  A:  two  columns  for  the  OCIV  number. 

Note  B:  the  parentheses  are  here  only  for  the  purpose  of  conforming 
with  the  printout.  These  columns  could  be  left  blank. 

Note  C:  one  column  for  the  power. 

Note  D:  X  indicates  multiplication. 

Each CC No. 5 describes a CIV to be deleted from the automatically
generated set of CIVs or to be generated and included with any CIVs which
may have been automatically generated. Hence the number of CCs No. 5
equals the number of CIVs to be deleted or to be "hand-generated" and
is given in columns 17-19 of CC No. 1.

If  TP  >  0  and  no  GCIVs  are  to  be  generated,  P  in  column  15  of  CC  No.  1 
must  be  set  equal  to  one. 

As an example, GCIV 2(1) x 4(2) (x_2 x_4^2 in the usual notation) is written
in the format illustrated above. In this example the power sum is 3; the
maximum power sum for a GCIV is 21. No GCIV may contain more than 6 OCIVs
and no OCIV can be raised to a power greater than 9. CC No. 5 input is the
only way to obtain GCIVs with power sums > 6.

When  preparing  CC  No.  5,  include  in  one  group  CIV  descriptions  with 
the  same  power  sum.  Arrange  these  groups  by  increasing  power  sum.  On  any 
CC  No.  5  the  OCIV  numbers  must  be  in  increasing  order  from  left  to  right. 
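
The restrictions on a hand-generated GCIV stated above (at most 6 OCIVs, powers not above 9, power sum not above 21, OCIV numbers increasing from left to right) can be collected in a short check. The sketch below is illustrative only. The representation of a GCIV as a list of (OCIV number, power) pairs and both function names are hypothetical, and "increasing" is read here as strictly increasing, which is an assumption.

```python
def power_sum(terms):
    """Sum of the powers of a GCIV given as (ociv_number, power) pairs."""
    return sum(power for _, power in terms)


def check_gciv(terms):
    """Return a list of rule violations; an empty list means none were found.

    terms: list of (ociv_number, power) pairs, e.g. [(2, 1), (4, 2)]
    for the GCIV 2(1) x 4(2).
    """
    problems = []
    if len(terms) > 6:
        problems.append("more than 6 OCIVs")
    if any(not 1 <= power <= 9 for _, power in terms):
        problems.append("a power is outside 1-9")
    numbers = [number for number, _ in terms]
    # Assumption: OCIV numbers must be strictly increasing.
    if any(b <= a for a, b in zip(numbers, numbers[1:])):
        problems.append("OCIV numbers are not in increasing order")
    if power_sum(terms) > 21:
        problems.append("power sum exceeds 21")
    return problems
```

For the example GCIV 2(1) x 4(2), `power_sum` gives 3 and `check_gciv` reports no violations; writing the same GCIV as 4(2) x 2(1) is flagged because the OCIV numbers decrease.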


Control Card No. 6


(Optional - omit when there are no quantitative factors.)

Column:        1 2    3 ... 12    13 ... 22    23 ... 32
Entry:           2       3.5          6.5         11.5

Columns 1-2 hold the factor number of the quantitative factor.


A  set  of  CC  No.  6  gives  the  (uncoded)  quantitative  factor  levels  for 
one of the factors which are indicated as being quantitative on CC No. 3.
These sets of CC No. 6 should be in the same order as the quantitative
factor numbers are entered on CC No. 3.


In  columns  1-2  of  the  first  of  a  set  of  CC  No.  6  for  a  particular 
factor  there  must  be  punched,  right  adjusted,  the  factor  number.  In 
columns  3-12  the  value  of  the  first  level  is  entered,  in  columns  13-22 
the  value  of  the  second  level,  ...,  in  columns  63-72  the  value  of  the 
seventh level. The value of factor level 8 would begin on the next
card  in  columns  3-12  (with  columns  1-2  blank)  and  so  on  until  the  values 
of  all  levels  have  been  entered  for  this  factor.  The  level  values  are 
read  with  an  F10.5  format. 

As  an  example,  the  values  of  the  three  levels  of  quantitative  factor 
number 2 are entered: 3.5, 6.5, and 11.5 for the first, second, and third
level,  respectively. 
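
The layout of a set of CC No. 6 (factor number right-adjusted in columns 1-2 of the first card, seven level values of ten columns each per card, columns 1-2 blank on continuation cards) can be sketched as follows. The function name is hypothetical. Each value is written with an explicit decimal point and five decimals, which is one way of filling a field that is read with an F10.5 format.

```python
def build_cc6(factor_number, level_values):
    """Return the CC No. 6 card images for one quantitative factor.

    factor_number: the factor number (1 to 99), punched on the first card only.
    level_values: the uncoded level values in level order.
    """
    cards = []
    for start in range(0, len(level_values), 7):
        # Columns 1-2: factor number on the first card, blank afterwards.
        prefix = f"{factor_number:2d}" if start == 0 else "  "
        body = "".join(f"{value:10.5f}" for value in level_values[start:start + 7])
        cards.append((prefix + body).ljust(80))
    return cards
```

For the example above, `build_cc6(2, [3.5, 6.5, 11.5])` yields one card with " 2" in columns 1-2 and the three values in columns 3-12, 13-22 and 23-32. A factor with eight levels yields two cards, the second starting with two blank columns.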

Data  Cards 

1. The 1st data card (optional - omit if regression only: E = 0) gives
the  cell  identification,  using  two  columns  for  each  factor  and  as  many 
cards  as  necessary,  until  a  level  number  has  been  entered  for  each  factor. 
If  there  are  several  observations  of  the  dependent  variable(s)  y  in  one 
cell,  the  first  data  card(s)  must  be  repeated  for  each  of  them. 

Example: Cell 13 1

[Card layout diagram: columns 1 to 8, two columns per factor, for factor Nos. 1 to 4.]


2. The 2nd data card gives the value(s) of the dependent variable(s) y
in the cell identified by the 1st data card and/or associated with the
OCIV values entered on the 3rd data card(s). Each y-value occupies 10
columns beginning with columns 11-20. The values of up to 4 dependent
variables may be entered depending upon the value of NC6QDV in column 3
of CC No. 1. In order to specify the last "2nd data card", columns 1-4
of the 2nd data card must have the characters LAST. Otherwise, columns
1-10 are blank. The y-values are each read with an F10.5 format.

Example:  N060DV=2;  this  card  is  the  last  2nd  data  card  in  the 
problem  deck;  the  value  of  the  first  dependent  variable  is  2.0  and  the 
value  of  the  second  dependent  variable  is  4.5. 


[Card diagram: LAST in columns 1-4, columns 5-10 blank, 2.0 in columns
11-20, 4.5 in columns 21-30.]

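As a rough illustration of the 2nd data card layout, the following Python sketch splits a card image into the LAST flag and the y-values. The function name and the argument `n_dep` (the number of dependent variables) are hypothetical, and the sketch assumes that every field carries an explicit decimal point; it does not reproduce the implied-decimal behavior of a FORTRAN F10.5 read.

```python
def parse_second_data_card(card, n_dep):
    """Split a 2nd data card into (is_last, y_values)."""
    card = card.ljust(80)
    # Columns 1-4 hold LAST on the final 2nd data card, blanks otherwise.
    is_last = card[0:4] == "LAST"
    y_values = []
    for i in range(n_dep):
        # Field i occupies 10 columns, the first field being columns 11-20.
        field = card[10 + 10 * i: 20 + 10 * i]
        y_values.append(float(field))
    return is_last, y_values


card = "LAST" + " " * 6 + f"{2.0:10.5f}" + f"{4.5:10.5f}"
print(parse_second_data_card(card, 2))
```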

3. The 3rd data card (optional - omit when there are no OCIVs: TP = 0)
gives the (uncoded) values of the OCIVs as they are observed together with
the y-value(s) entered on the 2nd data card. The number TP of OCIVs is
given in columns 12-13 of CC No. 1. Each OCIV value occupies 10 columns,
8 values per card. Each OCIV value is read with an F10.5 format. If
there are several observations of the dependent variable(s) y for a given
set of OCIV values, the third data card(s) must be repeated for each
observation.

3.1.2  Model  Generation  Options 

As can be seen from the description of the Control Cards,
the generation of the N IVs of the NOVACCM model is controlled by the
entries in columns 1 to 19 (except column 3) of CC No. 1 and by the
entries in Control Cards No. 2 to 5. The options for the generation of
the FEMO part of the model will be discussed first.

The data classification to be analyzed by NOVACCM may
have a maximum number of E=99 factors (column 1, CC No. 1). Since the
limitation  of  the  number  N  of  IVs  in  the  model  is  139,  the  restriction 
on  the  feasible  number  of  factors  will  be  severe  in  most  cases. 

The order D (columns 4-5 of CC No. 1) up to which the program
will automatically generate DIVs cannot be larger than 6. This means that
the highest order interaction which can be included in the FEMO part of the
model is that among 6 factors. (The maximum order 6 of DIVs is also
reflected by the specifications of CC No. 4.) For example, if the user
has a case with E=6 factors, $f_1, f_2, \ldots, f_6$, and specifies D=6, the
program will automatically generate $\prod_{j=1}^{6} F_j - 1$ DIVs, where
$F_j$ is the number of levels of factor $f_j$, $j = 1, 2, \ldots, 6$. With
$F_j = 2$, that is, with a $2^6$ data classification, $2^6 - 1 = 63$ DIVs
will be automatically generated. The upper limit of the number of
automatically generated IVs (DIVs and/or CIVs) is 255. For example, the
DIVs of a four-factor classification, where each factor has 4 levels, could
be generated automatically by specifying E=D=4: there are $4^4 - 1 = 255$
DIVs in the full model of this case. However, since the model limitation is
N=139, at least 255 - 139 = 116 DIVs would have to be deleted from the
automatically generated set of 255 DIVs. The deletion, in this case, would
have to be done via CC No. 4 by specifying the 116 DIVs to be deleted.
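The counts quoted above can be checked with a short Python sketch. It assumes a full model of a completely crossed classification (D = E, all interactions included) with no CIVs; the function name `full_model_divs` and the constant names are introduced here for illustration only.

```python
from math import prod

N_MAX = 139      # limit on the number N of IVs in the model
AUTO_MAX = 255   # limit on the number of automatically generated IVs


def full_model_divs(levels):
    """Number of DIVs in the full model: product of level counts minus 1."""
    return prod(levels) - 1


for levels in ([2] * 6, [4] * 4):
    n_div = full_model_divs(levels)
    # DIVs beyond the model limit would have to be deleted via CC No. 4.
    to_delete = max(0, n_div - N_MAX)
    print(levels, n_div, n_div <= AUTO_MAX, to_delete)
```

For the 2^6 classification this gives 63 DIVs and nothing to delete; for the 4^4 classification it gives 255 DIVs, of which 116 exceed the model limit.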

The program variable GDD, column 6 of CC No. 1, controls
the 4 options of the generation of the FEMO part of the model; see further
below.


The number NCC4 of Control Cards No. 4 (columns 7-9 of CC No. 1)
equals the number of DIVs to be deleted or "hand"-generated. The range of
NCC4 indicates that it is theoretically possible to generate up to 139 DIVs
by hand (i.e., to go to the limit of N=139 IVs), or to delete up to 255
DIVs from the automatically generated set of up to 255 DIVs. In general,
the user will generate or delete considerably fewer DIVs by means of CC No. 4.

When the number NSCC4 (columns 10-11, CC No. 1) of sets of
CC No. 4 is larger than 1, the number of DIVs = number of cards, NCC4, is
the same for each of these sets. The only reason for having two or more
sets of CC No. 4 is the presence of two or more, respectively, confounded
constants for the data layout, see Appendix A. When there are two
confounded constants, it means that 2 models can be fitted which differ
only in one DIV, that is, N-T-1 DIVs are identical in both models. The
one remaining DIV represents, in each "possible" model, a different
one-degree-of-freedom effect, where these two effects are completely
confounded. If more than two constants are confounded for a data layout,
there are more than two "possible" models which are represented by more
than two sets of CC No. 4. FEMO executes the ranking of the factorial
effects for all NSCC4 models put in by the corresponding number of sets of
CC No. 4. See also Section 3.3.2 and Example No. 5 in Section 3.4.5.

Obviously, there are a number of ways to generate
the FEMO part of the NOVACCM model. The set of the N-T DIVs (with possibly
T=0) of the model may be generated in the following 4 ways:

I.  by  automatic  generation  only; 

II.  by  automatic  generation  and  "hand" -generation  via  CC  No.  4; 

III.  by  automatic  generation  and  deletion  via  CC  No.  4; 

IV.  by  "hand" -generation  only. 


I. "Automatic generation only" (by the program variable D, columns
4-5, CC No. 1) is applicable only when a "full model" is to be generated.
For the ANOVA part in FEMO, a "full model" means that all factorial effects
up to a given order (D) are to be generated and can be generated, which
requires the presence of observations in all cells of the associated data
classifications. For example, in a three-way classification with factors
A, B, and C, all DIVs representing main effects and 2-factor interactions
may be generated (D=2) if none of the three two-way classification tables
has empty cells: the A x B, the A x C, and the B x C table. (Note. In
the above discussion the absence of "identities" was assumed, see Appendix
A.)

II. "Automatic generation and hand-generation" of the DIVs may be
used in order to save on input writing in cases where a "full model" is
not to be generated. For instance, in the example mentioned under "I"
above, the user may wish to fit, and be able to fit, some DIVs representing
degrees of freedom of the three-factor interaction in addition to the full
model of order D=2. Rather than writing all N-T DIVs of the model on
CC No. 4 (which the user may do if he wishes to, see "IV." below), the
user may automatically generate the second order model (D=2) and write
only the additional third order DIVs by means of CC No. 4. See Example 4 in
Section

III. "Automatic generation and hand-deletion." Again taking the
above example, this third way of model generation enables the user to
automatically generate the third-order model (D=3) and then to write on
CC No. 4 the DIVs representing those degrees of freedom of the three-factor
interaction which are not wanted in the fit (or cannot be fitted) and are
to be deleted. (See Example 5 in Section 3.4.5.) The method of input
(II or III) in such a case is left to the user. In general, the user
will choose the way which means the least amount of input writing via
CC No. 4.

IV. "Hand-generation only." This option may be useful in some
cases when the input writing of DIVs to be deleted from an automatically
generated set involves more work (and more possibilities of writing
errors!) than would be encountered in specifying the whole set of N-T DIVs,
to be generated, on CC No. 4. See Example 6 in Section 3.4.6.

The options for the generation of the COMO part of the
NOVACCM model are very similar to those of the FEMO part.

The program variable P, columns 14-15, CC No. 1, is the
"power-sum" up to which the program will automatically generate CIVs.
The power-sum is defined, as the name suggests, as the sum of all powers
in a CIV. The GCIV $x_1^2 x_2^3 x_3$, for example, has a power sum of
2+3+1=6.

The power-sum, P, up to which the program will automatically
generate CIVs, is equivalent to the "order" of the CIV-model as it was
called in Section 2.2.1. The reader is referred to that section and to
formula (2-14) giving the total number, T, of CIVs in the model when P
is specified. A "full model" of order P means that all GCIVs may be
generated which, in general, is the case if no linear dependencies are
introduced into the matrix A by this generation. For a more detailed
discussion of linear dependencies ("obvious" and "non-obvious") in
regression models see Reference 2.
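The notion of a power-sum can be made concrete with a small Python sketch. It represents a CIV by its tuple of exponents and enumerates all exponent tuples whose power-sum lies between 1 and P. That a full model of order P consists of exactly these terms is an assumption of the sketch; formula (2-14) is not reproduced here, and the function names are hypothetical.

```python
from itertools import product


def power_sum(exponents):
    """Sum of all powers in a CIV given by its exponent tuple."""
    return sum(exponents)


def full_civ_model(n_ocivs, p):
    """All exponent tuples over n_ocivs variables with power-sum 1..p."""
    return [e for e in product(range(p + 1), repeat=n_ocivs)
            if 1 <= sum(e) <= p]


# Exponents 2, 3 and 1, as in the example with power sum 2+3+1.
print(power_sum((2, 3, 1)))
# Two OCIVs, P=2: x1, x2, x1^2, x1*x2, x2^2.
print(len(full_civ_model(2, 2)))
```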

While the upper limit for P is 6, GCIVs of higher order, or
larger power sum, may be "hand"-generated by means of CC No. 5. The maximum
power sum of GCIVs thus generated is 21.

The options I - IV described before for the generation of
DIVs apply correspondingly for the generation of CIVs. The 4 options are
controlled by the program variables GDC (column 16) and NCC3 (columns
17-19) of CC No. 1. As a consequence of the 4 options to generate each
part of the NOVACCM model, there are, in a model which is to contain
both CIVs and DIVs (i.e., in an analysis of covariance model), 4x4=16
different ways of generating the same model, provided one deals with
"full models" in both DIVs and CIVs.

3.1.3 Ranking Options

The ranking options in COMO and FEMO are controlled by the
entries in columns 24 to 38 of CC No. 1.

The input value for the program variable DROP, column 24,
determines whether cumulative dropping alone is performed in COMO and/or
FEMO (DROP = 0), or both cumulative and single dropping (DROP = 1). Since
in single dropping the ranking order of CIVs or factorial effects is taken
from the order established with the cumulative dropping, the additional
running time for single dropping is small. Therefore, the program user
will probably choose, in most cases, the option for both cumulative and
single dropping.

The difference between the two dropping procedures is in the
determination of the significant model. Since the two ranking orders are
identical, the difference between the procedures is only the step at which
the "non-significance," i.e., the I(X)-value, reaches a given significance
level $\alpha$ (see columns 26-37 of CC No. 1). In some cases, the two
significant models will be identical; in other cases, they will be
different. Therefore, the user who chooses the option for both dropping
procedures may face the problem of having to decide between two significant
models.

The problem is similar to the one encountered in "orthogonal"
ANOVA: Should one pool the non-significant effects into the error term or
not? Whereas in orthogonal ANOVA the reason for pooling is usually the
desire to increase the number of the degrees of freedom for error, the
pooling in NOVACCM is a feature of the ranking method employed here.

There is pooling in both dropping procedures. In cumulative dropping,
the pooling takes place in the numerator mean square of the F-value (of
the Main Theorem, see (2-10) in Section 2.1.2), whereas in single dropping
the pooling takes place in the denominator mean square. Of the two
ranking procedures of NOVACCM, the cumulative dropping procedure obviously
is the "right" one, because single dropping implies a redefinition of the
model at each step according to an intermediate result of the analysis.
However, single dropping is provided as an additional procedure since
cumulative dropping tends to be less powerful than single dropping. An
example may serve as an illustration: In the ranking order of the factorial
effects in a given problem, as established by FEMO, cumulative dropping,
the first k-1 effects ranked as least important may, in the true model of
the problem, be non-existent. Their associated mean square is then an
estimate of the error variance. The k-th effect, however, may exist and
may have a relatively large contribution to the numerator mean square in
the F-value of the Main Theorem. But as a consequence of the pooling
with the k-1 non-existing effects, the I(X)-value at Step k in FEMO may
be only slightly decreased as compared to Step k-1. In contrast to this,
single dropping would "detect" the significance of the least important
effect since it would have pooled all the k-1 non-existing effects into the
error term, where they actually belong according to the assumption made for
this example; the I(X)-value at Step No. k of single dropping would
definitely be much smaller than that of Step No. k-1, and would indeed
possibly reach a given significance level $\alpha$.
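The contrast between the two kinds of pooling can be illustrated numerically. The Python sketch below is one reading of the verbal description above, not a transcription of formula (2-10): cumulative dropping pools all effects dropped so far into the numerator and keeps the original error mean square as denominator, while single dropping tests the current effect alone against an error term enlarged by the previously dropped effects. The function name and all numbers are hypothetical, and the conversion of F-values to I(X)-values is omitted.

```python
def dropping_f_values(diff_ss, diff_df, error_ss, error_df):
    """Return (cumulative, single) F-values for each ranking step."""
    error_ms = error_ss / error_df
    cumulative, single = [], []
    pooled_ss, pooled_df = 0.0, 0
    for ss, df in zip(diff_ss, diff_df):
        # Single dropping: earlier effects are pooled into the error term.
        pooled_error_ms = (error_ss + pooled_ss) / (error_df + pooled_df)
        single.append((ss / df) / pooled_error_ms)
        pooled_ss += ss
        pooled_df += df
        # Cumulative dropping: all effects dropped so far share the numerator.
        cumulative.append((pooled_ss / pooled_df) / error_ms)
    return cumulative, single


# Three non-existent effects (mean square about equal to the error mean
# square of 1.0) followed by one real effect at step k = 4.
cum, sgl = dropping_f_values([1.0, 1.0, 1.0, 12.0], [1, 1, 1, 1], 20.0, 20)
print(cum[-1], sgl[-1])
```

With these numbers the cumulative F-value at step 4 is diluted by the three null effects, whereas the single F-value reflects the fourth effect alone, which is the loss of power described in the text.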

In many cases the gap between the two significant models
may be closed by applying essentially the equivalent of the alternative
ranking procedure discussed in Section 2.1.2, provided one makes the
(reasonable) assumption that the ranking orders resulting from the cumulative
and the alternative ranking procedures would be identical. In that case,
the non-significant I(X)-values at the steps of the cumulative dropping
procedure indicate which null hypotheses may be accepted. At each such
step where all previous null hypotheses could be accepted, the additional
regression sum of squares due only to the effect ranked at that step may
be divided by the associated degrees of freedom to give a mean square
("DIFF MS" in the FCA; see Section 3.3.1) which has expectation $\sigma^2$
if the null hypothesis concerning the effect ranked at the step is true.
Therefore, if in the single dropping procedure the I(X)-value reaches a
given significance level $\alpha^*$ at a step No. $k_1$, say, and if, in the
cumulative dropping procedure, I(X) reaches $\alpha^*$ at a later step
No. $k_2$, say ($k_2 > k_1$), one can divide DIFF MS from Step $k_1$ in the
cumulative procedure by the original error mean square based on n-N-1
degrees of freedom to form a valid F test. If this F-value is significant
at the same level as the F-value at Step $k_1$ in the single dropping
procedure, the two significant models are identical and the gap is closed.
See also Section 3.3.1 and the discussion of the numerical examples in
Section 3.4.

The  user  of  NOVACCM  should  keep  in  mind  that  the  main 
purpose  of  the  program  is  to  screen  incomplete  and  unbalanced  data 
classifications  for  significant  factorial  effects.  NOVACCM,  by  its  nature, 
cannot  always  give  clear-cut  answers  such  as  may  be  possible  in  "orthogonal" 
analysis of variance.

Therefore, if the gap between the two significant models
cannot be closed by any means, the statistician may conclude that the
true significant model is between the two models from the two procedures.
This situation will indicate the need for additional efforts to further
analyze the given body of data. (See Example 6 in Section 3.4.6.)

The variable KALPHA (column 25 in CC No. 1) specifies which
of the possibly up to three significance levels $\alpha$ (columns 26 to 37)
will be used for the determination of the significant model in COMO,
cumulative dropping, when the model also contains a FEMO part. That is,
KALPHA has importance only in the case of an analysis of covariance. (In
case of regression only, the specified value of KALPHA has no influence
upon the printout.)

The variable KALPHA would not have been required in NOVACCM
(for analysis of covariance cases) if there were only one $\alpha$-value
(instead of three) to be specified for the determination of the significant
model. However, for reasons given below, up to three such $\alpha$-levels may
be specified by the user and, therefore, one of them has to be chosen for
COMO by means of the variable provided, KALPHA. (The program could have
been constructed such that up to three FEMOs would have been performed
corresponding to the three $\alpha$-values; however, this possibility was
disregarded in order not to add unnecessarily to the computer running
time of a given problem.)

Since KALPHA determines which one of the three significance
levels is to be used in COMO, cumulative dropping, for FEMO, and considering
the fact that cumulative dropping in some cases yields significant models
which contain fewer terms than actually are significant, the choice of
the three $\alpha$-values and KALPHA should be made accordingly. That is, in
analysis of covariance it will be advantageous to choose the first of the
three $\alpha$-values larger than actually desired for the determination of
the significant model and then set KALPHA = 1. For example, if
$\alpha = 0.01$ is the desired level for the significant model in an analysis
of covariance case, one could choose $\alpha_1 = 0.05$, $\alpha_2 = 0.01$, and
$\alpha_3 = 0.001$, say, where then, with KALPHA=1, the significant CIVs to be
carried through the FEMO part of the ranking would be determined at the
0.05 level of significance in COMO, cumulative dropping. See also Example 6
in Section 3.4.6.

The above is one reason for having the possibility to
specify more than one $\alpha$-level of significance in NOVACCM. Another
reason is that the program gives a "full printout" containing all pertinent
information for a given step of the ranking only when a specified
$\alpha$-level is reached by I(X). By specifying the maximum of 3
$\alpha$-levels, say, NOVACCM will give, at the most, 3 full printouts for
both cumulative and single dropping in both COMO and FEMO (other than for
the first good step), provided the 3 levels are reached by I(X) at different
steps in each procedure. (Note. NOVACCM could have been constructed such
that the "full printout" would have been given at each step of the rankings;
however, this possibility was disregarded because of problem running time
considerations.)

The analyst, in choosing more than one significance level
and in obtaining the corresponding full printouts, is enabled to refine
his judgment concerning the significance of the CIVs and/or factorial
effects. By choosing two "neighboring" significance levels in addition
to the one principal level decided upon in advance, the user obtains all
information about the models which would have resulted from the other
two significance levels. In general, the user will choose, if he makes
use of the option, one $\alpha$-level above and one below the one principal
significance level. For example, he may have chosen $\alpha = 0.05$ as his
principal significance level. He would write $\alpha = 0.05$ into columns
30-33 of CC No. 1 as ALPHA(2) if he would like to specify two additional
values. These could be ALPHA(1) = 0.10 and ALPHA(3) = 0.01, for example.
It should be noted that the $\alpha$-values must be put in by decreasing
order, that is, ALPHA(1) > ALPHA(2) > ALPHA(3). This order is in accordance
with the characteristics of the backward ranking technique.
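The input rules for the significance levels lend themselves to a simple check before a deck is submitted. The Python sketch below is not part of NOVACCM; it merely encodes the rules stated above (at most three levels, strictly decreasing order) and assumes that KALPHA is a 1-based index into the list of levels, as suggested by the setting KALPHA = 1 for the first level.

```python
def check_alphas(alphas, kalpha):
    """Validate up to three significance levels and return the one KALPHA selects."""
    if not 1 <= len(alphas) <= 3:
        raise ValueError("between one and three levels may be specified")
    # ALPHA(1) > ALPHA(2) > ALPHA(3): strictly decreasing order is required.
    if any(a <= b for a, b in zip(alphas, alphas[1:])):
        raise ValueError("levels must be in strictly decreasing order")
    if not 1 <= kalpha <= len(alphas):
        raise ValueError("KALPHA must point at a specified level")
    return alphas[kalpha - 1]


print(check_alphas([0.10, 0.05, 0.01], 1))
```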

The variable CAP (column 38 of CC No. 1) specifies whether
to use restricted or unrestricted (relaxed) admissibility rules in the
ranking process. The terms "restricted" and "unrestricted" concern only
the admissibility rules for ranking CIVs and/or factorial effects containing
quantitative factors. (The ranking of factorial effects with respect to
the qualitative factors contained in them is always done under "restricted"
admissibility rules.) The choice of the type of admissibility rules is
entirely up to the user. Restricted admissibility in the ranking of CIVs
or factorial effects containing quantitative factors is necessary if the
option for coding the OCIV values and level values is chosen (in order to
achieve computational accuracy), provided the user wishes to have a
possibility to retransform the resulting values (for example, the regression
coefficients) without changing the significant model, see Abt et al. [1966].
Another application of restricted admissibility is when a significant model
is desired which is to contain all polynomial terms having lower order than
the significant terms. For example, a significant model may result, under
unrestricted admissibility rules, which contains only second order terms.
If the user wishes, in this case, for reasons of physical interpretation,
a model also containing the first order (linear) terms, he can achieve this
by applying the restricted admissibility rules.

Ranking under restricted admissibility rules is also the
only means to arrive at a breakdown of the sums of squares which corresponds
to the breakdown achieved by the method of orthogonal polynomials. As is
known, orthogonal polynomials are constructed such that each polynomial
term is fitted "in addition" to all previously fitted terms. This holds
regardless of whether one uses orthogonal coefficients in the
case of equidistant levels or actually constructs the polynomials in the
case of non-equidistant levels. Therefore, in the use of orthogonal
polynomials, the quadratic contrast as such, for example, is meaningless;
only the fact that it is fitted in addition to the linear contrast gives
it meaning. This feature of fitting "in addition" to all lower order terms
is achieved by the restricted admissibility rules in NOVACCM.

If none of the three discussed reasons for using restricted
admissibility rules in the ranking is present, the analyst should choose
the option for "unrestricted admissibility" (CAD=1). He may even have a
strong reason to do so because the relationship he is analyzing
statistically may be theoretically known to contain no linear terms, for
example. One can often observe that ranking (by NOVACCM) of CIVs or
effects under "unrestricted admissibility" leads to significant models
containing many fewer terms than result from ranking under restricted
admissibility. (See Example 3 in Section .) That is, unrestricted
admissibility may lead to a significant model in which a minimum number
of terms can explain a maximum of the total variability.

3.2 Running Time Formula

The running time needed by NOVACCM for a given problem obviously
is dependent upon many parameters. In order to find an approximate time
formula, a prediction equation was evaluated by applying the program DA-MRCA
to the actual running times of a number of problems which had been run with
NOVACCM on the IBM 7030 (STRETCH). In this study, the actual running time
used by NOVACCM was the "dependent variable", y, in minutes. As "independent
variables" (OCIVs, that is) the following three variables were used:
$x_1 = N$ = number of IVs; $x_2 = G$ = number of factorial effects; and
$x_3 = n$ = number of observations. The BIVOR-subroutine of DA-MRCA led to
a more concise formula as compared to the one resulting from IVOR. In
order to account for the time consumption caused by some of the other
parameters, the coefficients W, S, and H (see below) are introduced in the
formula. All actual running times which entered the least squares
evaluations are from problems where the ranking option for both cumulative
and single dropping was chosen. In previous studies, only small differences
were noted between the running times of rankings with restricted and
unrestricted admissibility rules. The parameter restricted/unrestricted
admissibility is, therefore, neglected in the formula. The number of full
printouts in the problems whose times were used in the evaluation may be
considered as representative of the typical problem.

The formula is as follows (T = NOVACCM time in minutes on the IBM 7030
STRETCH):

$$T = 0.6 + W S \cdot 10^{-6} \left[ 114\,nN + 26\,HGN^2 + 0.159\,Gn^2 \right]$$

where the symbols have the following meaning:

W = number of dependent variables
S = number of sets of CC No. 4
H = number of ANVAs performed (this must be estimated since H is
dependent upon the results of the analysis)
n = number of observations
N = number of IVs
G = number of effects. (In case of multiple regression, use G=N.)

In the use of the formula, the third term in the expression in brackets
may be neglected as long as n < 100, say. Only in analysis of covariance
has H to be estimated. In the two other types of problems, H=1. S > 1
applies only in cases with confounded constants; otherwise, S=1.

Since the estimated standard deviation for the least squares
fit of BIVOR was 0.9 (minutes), the formula is not very precise for the
running times of small problems. (See also the running times printed
for the examples in Section .) For example, for G=10, N=20, and n=100
(W=S=H=1, say), the formula yields:

$$\begin{aligned}
T &= 0.6 + 10^{-6}\,[114 \cdot 100 \cdot 20 + 26 \cdot 10 \cdot 400 + 0.159 \cdot 10 \cdot 10000] \\
  &= 0.6 + 10^{-6}\,[228000 + 104000 + 15900] \\
  &= 0.6 + 0.348 = 0.948 \approx 1 \text{ (minute)}.
\end{aligned}$$

However,  the  actual  running  time  for  a  case  like  this  may  be  as  much  as 
three  minutes.  The  relative  accuracy  is  much  better  for  large  cases  for 
which  the  formula  is  mainly  intended.  In  the  largest  case  used  for  the 
time study, where n=768, N=138, G=88, the actual running time of NOVACCM
was  minutes.  The  predicted  time,  by  the  formula  given,  for  this  case 
is  64.5  minutes. 
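For convenience, the time formula can be written as a small Python function. The coefficients 0.6, 114, 26, and 0.159 and the placement of H follow the numerical example in the text; treating W and S as multipliers of the bracketed term is an assumption of this sketch. With W=S=H=1 the function reproduces both cases quoted above.

```python
def novaccm_minutes(n, N, G, W=1, S=1, H=1):
    """Approximate NOVACCM running time in minutes on the IBM 7030 (STRETCH)."""
    # n: observations, N: IVs, G: effects (use G=N for multiple regression).
    bracket = 114 * n * N + 26 * H * G * N**2 + 0.159 * G * n**2
    # Assumption: W and S scale the whole bracketed term.
    return 0.6 + W * S * 1e-6 * bracket


print(round(novaccm_minutes(n=100, N=20, G=10), 3))    # small example
print(round(novaccm_minutes(n=768, N=138, G=88), 1))   # largest case of the study
```

Because the fit has a standard deviation of 0.9 minutes, the first value is only a rough indication, while the second agrees with the predicted time given for the largest case.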


Interpretation  of  Results 


In this section the meaning and use of the results contained
in the Final Comprehensive Analysis printouts of NOVACCM will be discussed.
The formulation of the complete printout for a problem is not given in
algebraical terms. However, the complete printout is discussed with
Example No. 1 in Section 3.4.1. For a general formulation of the complete
printout see Herring [1967] and the general interpretation of the printout
of DA-MRCA (Reference 2), which is similar to the complete printout of
NOVACCM. (Note. "Complete printout" means the entire printout for a
problem, whereas the "full printout" consists of the pertinent data at
a significant step in the ranking.)


The  Final  Comprehensive  Analysis 


The format of the Final Comprehensive Analysis is the same
for COMO, FEMO, and for the ANVAs. (In case of more than one set of
CC No. 4, the FCAs of FEMO are merely repeated as the "Final FCA" at the
end of a problem printout.) If the corresponding option is requested
(column 24 of CC No. 1), the FCA is printed for both the cumulative and
the single dropping procedure in COMO and/or FEMO.

The FCA format has 12 columns which are discussed below,
starting from the left.

"STEP". This column gives the step number of the ranking procedure.
(These are the same numbers which identify the "full printouts.") If COMO
has been part of the ranking, FEMO will always start again with step
number "1." The step numbers are not influenced by the fact that one
or more models could not be accepted by the program. (See Example 7 in
Section 3.4.7.)

"CONC VAR" (COMO) or "EFFECT" (FEMO). The second column from the
left gives the symbol of the concomitant variable ranked at a given step
of COMO or the symbol of the effect ranked at a given step of FEMO. The
symbols used are explained in Section 2.2.

"PRC". In this column the PRoCedure is indicated which was used in
the ranking of an effect at a given step of FEMO. Depending upon whether
the +-, ++-, or +++-procedure occurred at the given step, the corresponding
symbol is printed in this column. Also printed in the PRC column is an
asterisk if, at the given step, I(X) reached one of the up to three
specified significance levels $\alpha$ for the first time. Since even the
smallest specified $\alpha$-value will be larger than the value
$C_0 = 10^{-8}$ which, if reached by I(X), activates the +- (or ++-, or
+++-) procedure, the printing of the symbol "+" (or "++", or "+++") always
has predominance over the asterisk.

An asterisk indicates which step in the ranking corresponds to the
significant model for the $\alpha$-level which is associated with that
asterisk. That is, the CIVs or effects whose symbols are printed at the
step number where the asterisk occurs and at all higher step numbers belong
to the significant model corresponding to that asterisk.

"I(X)".  This column gives the computed value of I(X) which is
associated with the CIV or effect ranked at a given step.  In general,
the printed I(X)-values will decrease with increasing step numbers.  Due
to the behavior of the values which enter the I(X) computation, however,
the I(X)-values may fluctuate considerably in some cases.

Naturally, the asterisk in the PRC column at a given step corresponds
to the I(X)-value which reaches, for the first time, a value smaller than
or equal to the significance level α associated with that asterisk.  For
example, if the first asterisk printed corresponds to α = 0.05, the
I(X)-value of this step will be smaller than or equal to 0.05.

"DIFF MS".  The abbreviation of this column stands for "DIFFerence
Mean Square."  The value printed is the difference between the regression
sums of squares of two consecutive steps in the ranking, divided by the
degrees of freedom of the CIV (which is always 1) or the effect ranked
at the given step.  In "orthogonal" analysis of variance, DIFF MS equals
the mean square which one would obtain in a regular ANOVA table (see
Examples 2 and 3B in Section 3.4).

In "non-orthogonal" ANOVA, as was discussed in Section 3.1.3, DIFF MS
can be used as a basis for a valid F test only when the null hypotheses
corresponding to all previously ranked CIVs or effects could be accepted.
The user would, in this case, divide DIFF MS by MS(2) (given in a column
of the FCA discussed further below) to obtain an F-value whose significance
he can find from an F-table.

In  the  single  dropping  procedure  where  the  user  is  willing  to  redefine 
his  model  at  each  step  of  the  ranking  order  (which  was  established  by  the 
cumulative  ranking  procedure),  DIFF  MS  actually  is  the  basis  of  the  F-value 
printed in an FCA column further to the right.

"DIFF DF".  The abbreviation of this column stands for "DIFFerence
Degrees of Freedom," and the number DIFF DF printed is associated with
DIFF MS as indicated before.  Independently of that association, the
DIFF DF column takes the place of the usual degrees-of-freedom column
in a "regular" ANOVA table.

"F".  This column gives the F-value of the Main Theorem, see (2-10)
in Section 2.1.2, as computed for the CIV or effect ranked at the given
step, for the cumulative or the single dropping procedure, as applicable.

"MS(1)".  This column gives the computed mean square of the numerator
in F of the previous column.  In single dropping, MS(1) equals DIFF MS for
obvious reasons.

"DF(1)".  This is the number of degrees of freedom in the numerator
of the value in the F-column.  If, in cumulative dropping, DF(1) is
multiplied by the MS(1)-value of the previous column, the result is the
"additional regression sum of squares", SS_{N-N'}, of the Main Theorem, due
to the N-N' IVs which have been deleted at the given step (when the number
of IVs remaining in the model is N').  In single dropping, one has merely
DF(1) = DIFF DF and DF(1) · MS(1) = DIFF DF · DIFF MS.

"MS(2)".  This column gives the computed mean square of the denominator
in the F-value printed in the F-column.  MS(2) is the estimate of the
residual variance and is used as such in the ranking procedure.  In
cumulative dropping, MS(2) remains constant through all steps until a "+",
or "++", or "+++" is printed in the PRC-column.  (See the FEMO-part of
the flowcharts in Section 2.4.1.)  In single dropping, MS(2) is redefined
at each step according to the redefinition of the model at each step in
this ranking procedure.

"DF(2)".  This is the number of degrees of freedom in the denominator
of the value in the F-column.  DF(2) is associated with MS(2) of the
previous column in an obvious manner.  If the two values are multiplied,
the result is the residual sum of squares.

"COEFF DET".  In this column the value of the coefficient of
determination is printed for the model of a given step before the CIV or
effect ranked at the step has been deleted.  This means, for example, that
the value of the coefficient of determination printed at the first step
of the FCA for COMO is the one for the model containing all N IVs.
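
The relations among the F, MS(1), DF(1), MS(2) and DF(2) columns described above can be summarized in a short Python sketch. The class and attribute names are hypothetical, and the numbers in the example row are made up; only the arithmetic relations are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class FcaRow:
    """One row of an FCA, restricted to the columns that enter the F-value."""
    ms1: float  # MS(1): numerator mean square
    df1: int    # DF(1): numerator degrees of freedom
    ms2: float  # MS(2): denominator mean square (residual variance estimate)
    df2: int    # DF(2): denominator degrees of freedom

    @property
    def f_value(self):
        # The F column is the ratio of the two mean squares.
        return self.ms1 / self.ms2

    @property
    def additional_regression_ss(self):
        # DF(1) times MS(1): sum of squares due to the deleted IVs.
        return self.df1 * self.ms1

    @property
    def residual_ss(self):
        # DF(2) times MS(2): residual sum of squares.
        return self.df2 * self.ms2


# Made-up row, for illustration only.
row = FcaRow(ms1=36.0, df1=2, ms2=3.0, df2=12)
print(row.f_value, row.additional_regression_ss, row.residual_ss)
```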

3.3.2  The  Final  FCA 

In  case  there  is  more  than  one  set  of  Control  Cards  4  in  a 
given  problem  when  FEMO  is  used,  the  program  will  repeat,  at  the  end  of 
the  printout  of  the  problem,  all  FEMO  FCAs  which  had  been  printed  earlier 
at  the  ends  of  the  printouts  for  each  individual  set  of  CC  4.  The  reason 
for  printing  the  Final  FCA  is  to  ease  the  comparisons  between  the  results 
corresponding  to  the  various  sets  of  CC  4. 

In general, each FCA corresponding to a set of CC 4 will
show a different significant model.  It should be remembered that each
set of CC 4 corresponds to a different model which includes one possible
selection from the set of the confounded constants (see Appendix A).
Consequently, there are, in general, as many models, or sets of CC 4
(and, therefore, as many FEMOs in the Final FCA), as there are possible
selections from the set of the confounded constants.  The Final FCA then
serves in finding that significant model which contains the least number
of significant effects; this model is then called "the most probable
significant model."  For further discussion of the use of the Final FCA the
reader is referred to Example 5 in Section 3.4.5.
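
The selection of the most probable significant model can be pictured as follows. This is a minimal sketch of the "least number of significant effects" criterion only; the function name and the three significant models are hypothetical, and the handling of ties (the first set wins) is an assumption, since the text does not address it.

```python
def most_probable_model(models_by_set):
    """Pick the set of CC 4 whose significant model has the fewest effects.

    `models_by_set` maps a set number to the list of significant effects
    found in the FEMO for that set.  Ties go to the set listed first.
    """
    return min(models_by_set.items(), key=lambda item: len(item[1]))


# Hypothetical significant models for three sets of CC 4.
models = {
    1: ["A", "B", "AB"],
    2: ["A", "C"],
    3: ["A", "B", "C", "BC"],
}

best_set, best_effects = most_probable_model(models)
print(best_set, best_effects)
```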


The  sequence  of  the  printout  of  the  individual  FCAs  in  the 
Final  FCA  is  as  follows,  assuming  the  most  general  case  where  several 
dependent  variables  have  been  analyzed  and  where  both  dropping  procedures 
have been performed (using the actual form of the printout in the
identification of the individual FCAs):


FEMO/Y1/CUMUL/SET 1
 "  /Y2/  "  /  "
 .   .    .     .
 .   .    .     .

FEMO/Y1/CUMUL/SET 2
 "  /Y2/  "  /  "
 .   .    .     .
 .   .    .     .

FEMO/Y1/CUMUL/SET 3
 "  /Y2/  "  /  "
 .   .    .     .
 .   .    .     .

FEMO/Y1/SINGLE/SET 1
 "  /Y2/  "  /  "
 .   .    .     .
 .   .    .     .

FEMO/Y1/SINGLE/SET 2
 "  /Y2/  "  /  "
 .   .    .     .
 .   .    .     .

FEMO/Y1/SINGLE/SET 3
 "  /Y2/  "  /  "
 .   .    .     .
 .   .    .     .


In the Final FCA, only the results of steps with
accepted models are given.  This is the only possible deviation from the
printouts of the individual FCAs.
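
The printing sequence described above (dropping procedure outermost, then the set of CC 4, then the dependent variable) can be reproduced with a short Python sketch. The function name is hypothetical; the label format follows the identification lines shown in the text.

```python
from itertools import product


def final_fca_order(n_dependent, n_sets, procedures=("CUMUL", "SINGLE")):
    """List the FCA identifications in the order of the Final FCA.

    The dependent variable changes fastest, then the set of CC 4,
    and the dropping procedure changes slowest.
    """
    return [
        f"FEMO/Y{y}/{procedure}/SET {s}"
        for procedure, s, y in product(
            procedures, range(1, n_sets + 1), range(1, n_dependent + 1)
        )
    ]


labels = final_fca_order(n_dependent=2, n_sets=3)
for label in labels:
    print(label)
```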

3.3.3  Additional Analyses of Variance ("ANVAs")

As mentioned earlier, in case of analysis of covariance
when significant CIVs were found in COMO (cumulative dropping) and
accordingly were kept in the model through the FEMO ranking, the program
performs additional rankings of the factorial effects for the dependent
variable y (under exclusion of all CIVs from the model) and for each
significant OCIV and for each OCIV contained in a significant GCIV,
provided the corresponding option (ALA > 0 in column 46 of CC No. 1) is
used.  Only the cumulative ranking procedure is performed in the
ANVAs which are actually FEMOs for y and the OCIVs concerned.

The ANVA for an OCIV may show factorial effects to be
significant when the OCIV is not a "fixed" variate (i.e., does not correspond
to the theory of analysis of covariance) but is a random variable itself
like the dependent variable, y.  For example, if one "factor" in the data
layout is "time of the day" (with levels 8 a.m., 12 noon, 2 p.m., say), and
if one covariate is air temperature, the factor "time of the day" will
almost certainly appear to have an influence upon this OCIV.

If, in this example, the response variable y is a true
function of the time of day (independent of air temperature) the factor
"time of day" will appear to have a significant effect also upon y.
Therefore, both the ANVAs for y and the OCIV (temperature) will show the
effect "time of day" to be significant.  If both variables are analyzed in
combination in an analysis of covariance model, and if temperature exercises
an additional effect upon the response variable, the OCIV "temperature"
may be significant (in COMO) and, if kept in the model through the FEMO
ranking, may cause the effect "time of day" to be non-significant with
respect to y.  In other words, in performing an analysis of covariance
alone there is the possible danger of not detecting the significance of
a factorial effect.  Performance of the ANVAs for y and the OCIVs will
prevent the indicated danger.  Also, the ANVAs will give, in combination
with the analysis of covariance, a much better general picture of the
relationship between the variables concerned than the analysis of covariance
results alone could give.  See also Example 6 in Section 3.4.6.

The  final  comprehensive  printouts  of  the  ANVAs  are 
complemented by listings of the admissible effects at each step of an
ANVA and the associated I(X)-values and their arguments.  These
complementary  printouts  serve,  as  they  do  in  the  other  printouts,  to 
inform  the  analyst  how  the  ranking  of  the  factorial  effects  was  actually 
performed. 


3.4  Examples of Application


In  this  section,  7  examples  of  application  of  the  NOVACCM 
program  are  discussed.  Since  it  is  not  possible  to  show  all  features  of 
the  program  in  the  printout  of  one  single  example,  those  parts  of  the 
printout  of  the  examples  are  reproduced  which  show  features  not  exemplified 
in  other  examples.  For  Example  1,  the  complete  printout  is  reproduced. 

For  some  examples,  only  the  Final  Comprehensive  Analyses  (FCAs)  are 
reproduced. 


Following is a list of the headings of the 7 examples:

Example 1:  Multiple Regression (Duncan, 1959)

Example 2:  Half Replicate of 2x2x2x2 Factorial (Davies, 1956)

Example 3:  3x4 Factorial (Hicks, 1964, and Robson, 1959)

    A:  Restricted Admissibility, Uncoded

    B:  Unrestricted Admissibility, Uncoded

    C:  Unrestricted Admissibility, Coded

Example 4:  2x4x4 Factorial, 5 cells empty (Stevens, 1948)

Example 5:  3x3x2 Factorial, 5 cells empty, 3 constants confounded

Example 6:  3x3x3 Factorial, 9 cells empty, with 3 OCIVs, 2 dependent
            variables

Example 7:  3x3x3 Factorial (Example 6 modified) with singularity in
            matrix A.

The  areas  of  application  exemplified  are  as  follows: 

I.  Multiple  (polynomial)  regression:  Example  1. 

II.  Analysis of variance for orthogonal data layouts:  Examples 2,
3A, 3B, 3C.

III.  Analysis of variance for non-orthogonal data layouts without
confounding:  Example 4.

IV.  Analysis of variance for non-orthogonal data layouts with
confounding:  Example 5.

V.  Analysis  of  covariance  for  non-orthogonal  data  layouts  with 
confounding:  Example  6. 

The  various  features  of  the  program  are  illustrated  as  follows: 

1.  Complete  printout:  Example  1. 

2.  Multiple  dependent  variables:  Example  6. 

3.  Model generation options (not all possibilities illustrated):

    CIVs, all automatically generated:  Example 1.

    CIVs, automatically generated; and deleted via CC No. 5:
    Example 6.

    DIVs, all hand-generated via CC No. 4:  Examples 6, 7.

    DIVs, all automatically generated:  Example 3.

    DIVs, automatically generated; and hand-generated via CC No. 4:
    Examples 2, 4.

    DIVs, automatically generated; and deleted via CC No. 4:
    Example 5.

4.  Types of factors:

    All qualitative:  Examples 2, 4, 5.

    All quantitative:  Example 3.

    Both qualitative and quantitative:  Examples 6, 7.

5.  Coding:  Examples 3C, 6, 7.

6.  Admissibility for ranking:

    Restricted:  Examples 1, 3A, 6, 7.
    Unrestricted/Relaxed:  Examples 3B, 3C.

7.  ANVAs:  Example 6.


All example problems were run with the option for both cumulative
and single dropping, and in all examples except No. 1 and No. 2, 3
significance levels α were specified (1 only in Examples 1 and 2).

3.4.1  Example 1

This example is exhibited in order to show the capability of
NOVACCM in multiple regression.  The example also serves to illustrate the
entire printout of the program for this case.  The exhibit of the complete
printout permits a comparison with the treatment of the problem by DA-MRCA
(Reference 2); the data was also used as an example in the documentation.


The data is taken from Duncan [1959], page 697.  There are
two OCIVs (x1 = Plate Thickness in Inches, and x2 = Brinell Hardness
Number) and one dependent variable (y = Ballistic Limit in Feet/Sec.).
The polynomial of 3rd order in x1 and x2 is automatically generated which
leads to N=9 CIVs.  The number of data points is n=20.  See the reproduced
input sheet for this example.  Following the input sheet is the reproduced
printout from NOVACCM on which are written the numbers of the notes which
follow below.  The reader is also referred to the notes on the flowcharts
in Section 2.4.2 which explain many of the features of the complete
printout.
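
The count of N=9 CIVs for two OCIVs and a polynomial of 3rd order can be checked with the following Python sketch. It only enumerates the monomials; the function name is hypothetical, and the ordering of the terms (by total degree, higher powers of x1 first) is an assumption made for illustration, not a statement about the IV numbering used by NOVACCM.

```python
from itertools import product


def polynomial_terms(n_vars, order):
    """Exponent tuples of all monomials with 1 <= total degree <= order.

    Terms are sorted by total degree, then with higher powers of the
    first variable first (an assumed ordering).
    """
    terms = [
        exponents
        for exponents in product(range(order + 1), repeat=n_vars)
        if 1 <= sum(exponents) <= order
    ]
    return sorted(terms, key=lambda e: (sum(e), [-p for p in e]))


terms = polynomial_terms(n_vars=2, order=3)
for number, (p1, p2) in enumerate(terms, start=1):
    print(f"IV {number}: x1^{p1} * x2^{p2}")
```

Of the nine terms, the two of degree 1 are the OCIVs themselves and the remaining seven are the generated CIVs.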

Notes  on  (complete)  printout  Example  1. 

(References to "boxes" are to the comments in Section 2.4.2.)

Note  1.1.  The  entries  on  Control  Card  No.  1  are  printed  for  identifi- 
cation  purposes. 

Note 1.2.  The 2 OCIVs ("IV 1" and "IV 2") and the 7 automatically
generated GCIVs (P=3, CC No. 1, columns 14-15) are identified.  See
Section 2.2.1 for the notation used.

Note 1.3.  The data input is printed (x1, x2, y).  See comments on
Box 1.1.

Note 1.4.  The maximum and minimum value for each OCIV is given plus
the range and the interval size ("DELTA") for the frequency bar charts of
the OCIV values.  See comments on Box 1.1.

Note 1.5.  The "FULL DATA MATRIX" contains the values of the 9 CIVs
(2 OCIVs and 7 GCIVs) and of the dependent variable.  The horizontal and
vertical marginal identifications give the IV numbers and the observation
numbers, respectively.  See comments on Box 1.3.

Note 1.6.  The "SUMMATION MATRIX" is the (N+2) x (N+2) matrix composed
of the (N+1) x (N+1) matrix (A) of the coefficients of the normal equations,
i.e., of the terms \sum_i x_{vi} x_{v'i}, v, v' = 0,1,...,N (N=9 here), with
x_{0i} = 1; and of the N+2 terms (only the row vector printed)
\sum_i x_{vi} y_i and \sum_i y_i^2.  The two marginal identifications give
the IV numbers, v = 1,...,9.  See comments on Box 1.3.


Note 1.7.  The averages of the values of the N=9 CIVs and of y are
printed.  (For example, \bar{x}_1 = (1/20) \sum_{i=1}^{20} x_{1i} = 0.249799..,
and \bar{y} = (1/20) \sum_{i=1}^{20} y_i = 1179.15.)


Note 1.8.  This is a printout of the IV-numbers of the admissible
CIVs at the first step of COMO.  See Note 1.2: under restricted admissibility,
only x1^3, x1^2 x2, x1 x2^2, and x2^3 are admissible for ranking at the
first step.

Note 1.9.  The regression coefficients are always printed at the first
step.

Note 1.10.  The I(X)-value corresponding to the CIV (IV No. 6: x1^3)
ranked least important at the first step of COMO is printed together with
its arguments: ARG 1 = f2/2 = 5 and ARG 2 = f1/2 = 0.50.  (See also formula
(2-12) in Section 2.1.2.)

Note 1.11.  The printout identification indicates whether this is a
COMO or a FEMO ranking; the number of the dependent variable is given
("Y1" if there is only one dependent variable as in the present example);
the ranking option is indicated: CUMUL = cumulative dropping; SINGLE =
single dropping; and the number of the SET of Control Cards No. 4 is given.
If there is none or one set of CC 4, or if the problem is one of multiple
regression (as is the case here), "SET 1" is printed here.  (See also the
identification lines at previous places of the present example: "SET 1"
is printed everywhere, whereas the other 3 spaces are left blank when not
applicable.)

Note 1.12.  The inverse matrix A^-1 is printed for the first step
of COMO where the model contains all N=9 IVs.

Note 1.13.  The main diagonal elements of the computed identity
matrix A^-1 A are not printed since all deviations from 1 were smaller than
TOLI(2) = 0.001 which was used as input value.

Note 1.14.  Following are the printouts associated with the Chi-square
computations for the normality test of the residuals.  These printouts are
always given for the first (good) step.

Note 1.15.  Admissible for ranking at the second step are CIVs Nos. 7,
8 and 9; that is, after dropping CIV No. 6 from the model, no CIV became
additionally admissible for ranking.  Following is the ranking information
for steps 2 through 7 (see Notes 1.8 and 1.10) and the values of the
regression coefficients for each step because IRCO = 1 in column 45 of
CC No. 1.

Note 1.16.  At Step Number 7, I(X) reaches, for the first time, the
first specified α-level: α1 = 0.05, see columns 26-29 of CC No. 1.  The
"full printout" for this step follows; see the comments on Box 7.4.

Note 1.17.  The statement "DEVIATIONS OF MAIN DIAGONAL IDENTITY MATRIX
ELEMENTS LESS THAN .001" is printed; however, the actual computational check,
in this example, was done for the first step only.  Once the model of a
step has been accepted by NOVACCM, the accuracy checks are not performed
anymore after that step.  See also Section 2.1.3.

Note 1.18.  The "RESIDUAL OR ERROR SUM OF SQUARES", at this point of
the "full printout", is defined as ATSS-ASSR(N'), with N' = 9-(7-1) = 3 in
the present example.  That is, should the analyst decide to use the model of
this step (the "significant model" at α = 0.05, containing 3 CIVs) as the
prediction model, while pooling all non-significant CIVs into the error,
the value printed is the pooled error sum of squares.  The "CHECK ERROR
SUM OF SQUARES" is the sum of the squares of the prediction errors given
further below (see Note 1.19).  It serves as an additional check on the
computational accuracy.  The "SQUARE ROOT OF (the) RESIDUAL VARIANCE" is
the estimated error standard deviation, σ, for this step based on the Error
Sum of Squares discussed above.  The value σ is used in the computations of
the standard deviations of the regression coefficients given later in the
full printout (see Note 1.20).  See also comments on Box 7.4.

Note 1.19.  The "PREDICTED VALUES", the "PREDICTION ERRORS", and the
subsequent Chi-square computations are based on the model of this step,
that is, on the model containing the 3 significant CIVs.  See Note 1.18.

Note 1.20.  The "STANDARD DEVIATIONS OF THE REGRESSION COEFFICIENTS"
are the values σ \sqrt{c_{vv}}, v = 0,1,2,4 in the present example, where σ
is the standard deviation of this step, see Note 1.18.

Note 1.21.  The last 9 lines give the information on the I(X)-computations
for COMO, single dropping.

Note 1.22.  The FCA for COMO, cumulative dropping, shows that the
significant model, at the α = 0.05 level, based on this dropping procedure,
contains x1x2, x1, and x2 (in their order of ranking, with x2 being the most
important CIV).  See also Section 3.3.1 on the interpretation of the FCA
results.  A comparison with the treatment of the same problem by DA-MRCA
(Reference 2) shows the same significant model obtained by NOVACCM and by
the BIVOR option of DA-MRCA.

Note 1.23.  The significant model, at the α = 0.05 level, resulting
from single dropping is the same as the one resulting from cumulative
dropping.

Note  1.24.  The  indicated  problem  running  time  is  28  seconds.  This 
is  approximately  the  same  time  which  the  problem  took  when  analyzed  by 
DA-MRCA. 

3.4.2  Example 2

Example 2 is to show the capability of NOVACCM in the
analysis of variance for incomplete but balanced data classifications.
This capability is demonstrated with one of the simplest cases possible,
namely that of a half-replicate of a 2^4-factorial experiment.  The data
is taken from Davies [1956], p. 455.  The layout of the 8 observations is
given in Table 3.1.  (The cell numbers are indicated in the upper left
corners of the cells.)  On the reproduced input sheet for this example
note that only 2 of the 3 two-factor interactions have been fitted in
order to provide the minimum of 1 degree of freedom for error in the
FEMO ranking.  The two interaction effects fitted are AB and AC; their
respective aliases CD and BD could have been fitted as well.  The four
main effects are automatically generated whereas the two interactions
are written on two CCs No. 4, that is, they have been "hand"-generated.


    Cell number      y

       1111         107
       1122         10[illegible]
       1212         122
       1221         120
       2112         114
       2121         121
       2211         130
       2222         132

Table 3.1

Data Layout Example 2
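
The structure of this half replicate can be checked with a short Python sketch. It assumes that the four digits of a cell number are the levels of the factors A, B, C, D in that order, and that the half replicate is the one whose level digits add to an even number (which corresponds to a defining relation involving ABCD); both points are inferred from the eight cell numbers, not stated in the text. The helper name `alias` is hypothetical.

```python
from itertools import product

# Cells of the full 2^4 layout whose level digits add to an even number.
cells = [
    "".join(str(level) for level in levels)
    for levels in product((1, 2), repeat=4)
    if sum(levels) % 2 == 0
]


def alias(effect, defining_word="ABCD"):
    """Alias of an effect under the assumed defining relation I = ABCD."""
    return "".join(sorted(set(effect) ^ set(defining_word)))


# Degrees of freedom: 8 observations, 1 for the mean, 4 main effects,
# 2 fitted two-factor interactions.
error_df = len(cells) - 1 - 4 - 2

print(cells)
print(alias("AB"), alias("AC"), error_df)
```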


A  partial  reproduction  of  the  printout  of  this  example 
is  given  in  order  to  show  features  which  could  not  be  shown  with  Example  1. 
The notes on these features follow below.

Notes  on  printout  Example  2. 

Note  2.1.  The  numbers  printed  here  are  programing  information  on 
the  admissibility  of  effects;  see  Herring  [1967]. 

Note 2.2.  The DIVs and the effects are identified.  In the present
case the number of DIVs equals the number of the effects since all factorial
effects are one-degree-of-freedom effects.

Note 2.3.  For a crossed data classification the FULL DATA MATRIX also
contains the cell identifications.  The numerical values of IVs No. 1
through 6 are the design point coordinates, whose values are either 1 or 0.

Note 2.4.  At the first step (of FEMO), effects No. 4, 5, and 6 are
admissible for ranking, that is, D, AB, and AC.  A, B, and C are sub-effects
of AB and AC and, therefore, are not admissible at the first step.

Note 2.5.  The three I(X)-values corresponding to the three effects
4, 5, and 6 are printed.  The second I(X)-value, that is, the I(X)-value
corresponding to effect No. 5, is the largest one.  Therefore, effect
No. 5 (AB) is ranked as the least important and deleted from the model.
With AB deleted from the model, effect No. 2 (B) becomes admissible for
ranking in addition to effects No. 4 and 6 at the second step.
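
The admissibility rule at work in Notes 2.4 and 2.5 can be sketched in Python for qualitative effects. The sketch assumes that an effect is written as the string of its single-letter factors and that an effect is admissible when it is not a sub-effect of any other effect still in the model; the function name is hypothetical, and sub-effect relations among quantitative (polynomial) effects are not covered.

```python
def admissible(effects_in_model):
    """Effects that are not a sub-effect of another effect in the model.

    An effect such as "AB" is treated as the set of its factors; "A" is a
    sub-effect of "AB" because {A} is a proper subset of {A, B}.
    """
    factor_sets = {name: frozenset(name) for name in effects_in_model}
    return [
        name
        for name, factors in factor_sets.items()
        if not any(factors < other for other in factor_sets.values())
    ]


model = ["A", "B", "C", "D", "AB", "AC"]  # effects No. 1 through 6
print(admissible(model))   # first step

model.remove("AB")         # effect No. 5 is dropped
print(admissible(model))   # second step
```

With the effect numbering of this example, the first call lists effects No. 4, 5 and 6, and the second call lists effects No. 2, 4 and 6.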

Note 2.6.  The last 5 lines of the I(X)-printouts are the first
five of FEMO, single dropping.  Note that the last I(X)-value is smaller
than ALPHA(1) = 0.05.  Therefore, a full printout is given for step No. 5
of FEMO, single dropping.

Note  2.7.  The  FCA  for  FEMO,  cumulative  dropping,  shows  that  there 
are  no  significant  effects.  However,  one  must  not  forget  that  this 
conclusion  is  based  on  only  1  degree  of  freedom  for  error.  For  the 
orthogonal data layout of this example, the DIFF MS column shows the
mean squares given by Davies [1956], p. 456.

Note 2.8.  The FCA of FEMO, single dropping, does show a significant
model which contains the two main effects A and B, with B being the effect
ranked most important.  For obvious reasons it does not make sense, in the
present example, which is exhibited for the purpose mentioned, to try to
close the gap between the results of the two ranking procedures.

Note 2.9.  Problem running time was 24 seconds.  This is relatively
long; however, one must not forget that a program like NOVACCM is bound
to be inefficient (timewise) for a small case such as the present one.

3.4.3  Example 3

Example 3 is exhibited in order to illustrate the effect of
coding and of the restricted admissibility rules upon the determination of
a significant model when there are quantitative factors.  At the same time,
the example demonstrates the applicability of NOVACCM in analyzing orthogonal
data layouts when the quantitative factor levels are non-equidistant.

[Table 3.2 is only partly legible in this copy.  Surviving entries:
observations 10, 9, 14, 13; cell totals Σ = 19, Σ = 13, Σ = 24, Σ = 22.]

Table 3.2

Data Layout Example 3


The  data  as  exhibited  in  Table  3.2  is  taken  from  two 
sources:  the  24  values  of  the  response  variable,  y,  are  from  Hicks  [1964], 
p.  129,  and  the  quantitative  factor  variable  values  are  from  Robson  [1959]. 
(The  totals  for  each  cell  are  given  for  purposes  to  be  seen  later. ) 

The data in the 3x4 classification with both factors A
and B being quantitative (leading to a breakdown into 11
one-degree-of-freedom components of the sum of squares between cells) is
treated in three different ways:

A (Example 3A): factor levels uncoded; restricted admissibility
in the ranking of factorial effects.

B (Example 3B): factor levels uncoded; unrestricted admissibility
in the ranking of factorial effects.

C  (Example  3C):  factor  levels  coded;  unrestricted  admissibility  in 
the  ranking  of  factorial  effects. 

The  resulting  significant  models  are  different  in  all  three  cases  as 
will  be  discussed  with  the  reproduced  printout  of  the  FCAs.  Besides  the 
FCAs  again  only  those  parts  of  the  actual  printout  are  reproduced  which 
show features not exhibited in the two previous examples.

Notes  on  printout  Example  3A. 

Note 3.1.  The level values (values of the quantitative factor
variables) are identified.

Note 3.2.  The quantitative factorial effects (all with 1 degree of
freedom) are identified.

Note 3.3.  Only effect No. 11 (A_quadr. x B_cubic) is admissible for
ranking at the first step since all other effects are sub-effects of this
effect.

Note 3.4.  The I(X)-value corresponding to effect No. 5 (B_cubic) is the
larger one among the two computed at this step so that this effect is being
dropped from the model at this step (No. 8).  Since the I(X)-value
corresponding to B_cubic is also smaller than ALPHA(2) = 0.01, a full
printout for this step is given.

Note 3.5.  Step No. 7 in "single dropping" yields an I(X)-value
which is smaller than ALPHA(2) = 0.01.  Since no full printout for step
No. 7 had been given before, it is given here.

Note 3.6.  With restricted admissibility in the FEMO ranking, the
"DIFF  MS"  column  in  the  FCA  shows,  for  this  orthogonal  case,  the  breakdown 
of  the  sum  of  squares  between  cells  into  orthogonal  components  as  one  would 
obtain  them  by  application  of  orthogonal  polynomials. 

The value DIFF MS = 5.2457707 for A_quadr. x B_linear (Step 5 of the
ranking) may be checked in employing the coefficients given by Robson [1959],
in his "Table 4."  The sum of squares (1 degree of freedom) due to the
component Q_A L_B is computed as follows, using the totals from Table 3.2
given earlier:

    -15·30
    +25·30
    -10·19
    - 9·23
    +15·31
    - 6·13
    + 3·36
    - 5·33
    + 2·24
    +21·24
    -35·26
    +14·22

    Sum = 183

    Sum of squared coefficients = 3192

    Sum of squares due to "Q_A L_B" = 183^2 / (2·3192) = 5.2457707.

Note that the significant model, at the α = 0.05 level, contains all 11
effects.
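
The arithmetic of this check can be repeated in a few lines of Python. Two points are assumptions rather than statements of the text: the divisor 2 is taken to be the number of observations per cell (24 values in 12 cells), and the sixth cell total is read as 13, the value that makes the products add to the stated sum of 183 and that matches one of the totals legible in Table 3.2.

```python
# Contrast coefficients and cell totals, in the order listed in the text.
coefficients = [-15, 25, -10, -9, 15, -6, 3, -5, 2, 21, -35, 14]
cell_totals = [30, 30, 19, 23, 31, 13, 36, 33, 24, 24, 26, 22]
replicates = 2  # assumed: 24 observations in a 3x4 layout

contrast = sum(c * t for c, t in zip(coefficients, cell_totals))
sum_sq_coefficients = sum(c * c for c in coefficients)

# Sum of squares of a single-degree-of-freedom contrast on cell totals.
sum_of_squares = contrast ** 2 / (replicates * sum_sq_coefficients)

print(contrast, sum_sq_coefficients, round(sum_of_squares, 7))
```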


Note 3.7.  The FCA for the single dropping procedure shows the same
significant model at the 0.05 level as did the FCA for cumulative dropping.
For the 0.01 significance level there is a gap between the two models.
This gap may be closed by dividing DIFF MS = 36.266447 of Step 7 in
cumulative dropping by MS(2) = 2.875 to give a value of F = 12.614 which,
with 1 and 12 degrees of freedom, is significant at ALPHA(2) = 0.01.
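
Instead of an F-table, the significance of such an F-value can be computed directly. The following Python sketch uses the standard relation between the upper tail of the F distribution and the regularized incomplete beta function, evaluated by a continued fraction; it is a textbook method chosen here for illustration, not a reproduction of the program's own I(X) routine, and all function names are hypothetical.

```python
import math


def _guard(value, tiny=1e-300):
    """Keep a continued-fraction term away from zero."""
    return tiny if abs(value) < tiny else value


def _beta_cf(a, b, x, max_iter=200, eps=3e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 / _guard(1.0 - qab * x / qap)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 / _guard(1.0 + aa * d)
        c = _guard(1.0 + aa / c)
        h *= d * c
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 / _guard(1.0 + aa * d)
        c = _guard(1.0 + aa / c)
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(
        math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
        + a * math.log(x) + b * math.log1p(-x)
    )
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def f_upper_tail(f_value, df1, df2):
    """Probability that an F(df1, df2) variate exceeds f_value."""
    x = df2 / (df2 + df1 * f_value)
    return reg_inc_beta(df2 / 2.0, df1 / 2.0, x)


f_value = 36.266447 / 2.875          # DIFF MS / MS(2) from Note 3.7
p_value = f_upper_tail(f_value, 1, 12)
print(round(f_value, 3), p_value < 0.01)
```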


Note 3.8.  This is the FCA, FEMO, cumulative dropping, for Example 3B.
Due to the unrestricted admissibility in the ranking, the ranking order is
different from that obtained in Example 3A.  Note the differences also in
the values of DIFF MS in this orthogonal case: none of the DIFF MS-values
is equal for Example 3A and 3B.  The significant model is equal for
cumulative and single dropping at both levels ALPHA(1) = 0.05 and
ALPHA(2) = 0.01.  Note that at the 0.01 level of significance, the
significant model as defined by the ranking now contains the effect
[symbol illegible] only.  However, dividing DIFF MS = 39.272801 for
[symbol illegible] (Step 9) by MS(2) = 2.875 yields F = 13.660 which is
also significant at the 0.01 level.


Note 3.9. Example 3C differs from Example 3B only by the coding of the quantitative factor level values. For instance, the three levels of factor 𝒜, i.e., X_a1 = 0, X_a2 = 2, and X_a3 = 5, with Range = 5 - 0 = 5 and with average = (0 + 2 + 5)/3 = 2.33, are in coded form: -0.4667, -0.0667, +0.5333. The ranking is again different (from those of Examples 3B and 3A) and the significant models contain even fewer terms than in Example 3B. The gap between the significant models of cumulative and single dropping at the 0.05 level may be closed by dividing DIFF MS = 17.402285 of Step 6 by MS(2) = 2.875, leading to F = 6.053, which is significant at ALPHA(1) = 0.05.
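The coded values quoted in Note 3.9 are reproduced by subtracting the average of the levels and dividing by their range. The sketch below infers that rule from the numbers in the note; the function name is hypothetical, and the program documentation should be consulted for the exact coding NOVACCM applies.

```python
# Coding of quantitative factor levels as suggested by the numbers in
# Note 3.9: coded = (level - average) / range.  This rule is inferred
# from the note, not taken from the NOVACCM program listing.
def code_levels(levels):
    """Return the coded values of a list of quantitative factor levels."""
    average = sum(levels) / len(levels)
    value_range = max(levels) - min(levels)
    return [(x - average) / value_range for x in levels]


if __name__ == "__main__":
    coded = code_levels([0, 2, 5])
    print([round(c, 4) for c in coded])  # [-0.4667, -0.0667, 0.5333]
```

Coded this way, the values sum to zero and span a range of exactly one, whatever the original spacing of the levels.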
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3.4.4  Example  4 

Example 4 is the first in this series of examples of NOVACCM applications which deals with an incomplete and unbalanced data layout. The data is that of the example treated by Stevens [1948], and the layout of the values of the response variable y = "gain in weight" is given in Table 3.3 in a slightly different arrangement than in "Table 1" on page 349 of the Stevens paper. The factor symbols used correspond to the 3 (qualitative) factors as follows:

Factor 𝒜: Sex (𝒜1 = "M", 𝒜2 = "F")

Factor ℬ: Type of wheat in diet (ℬ1 = "A", ℬ2 = "B", ℬ3 = "C", ℬ4 = "D")

Factor 𝒞: Litter (𝒞1 = "I", 𝒞2 = "II", 𝒞3 = "III", 𝒞4 = "IV")


£1 

Ce 

(7a 

C4 

(Si 

81 

43 

58 

73 

59 

81 

67 

82 

93 

33 

75 

8? 

101 

100 

s3 

91 

85 

92 

88 

106 

84, 

£9 

89 

98 

105 

108 

109 

az 

81 

33 

62 

71 

Bs 

60 

71  “ 

76 

83 

70 

70 

58 

73 

8* 

72 

76 

Table 3.3

Data Layout Example 4


The fitting of constants follows the rules given in Appendix A. All main effects and all two-factor interactions can be fitted; however, because of the five empty cells, only 4 constants for the 𝒜ℬ𝒞 interaction can be fitted. This interaction would be represented by 9 constants (9 degrees of freedom) in a "full model." There are no "identities" for this layout, that is, there is no confounding among the factorial effects. The fitting process for the 4 three-factor-interaction constants is illustrated in Table 3.4, where the types of checkmarks explained in Appendix A are used. The four circled "X's" indicate the four constants fitted: abc111, abc113, abc121, and abc132. (Other sets of 4 constants could have been chosen for the 𝒜ℬ𝒞 interaction.)


[Table 3.4: checkmark diagram of the 𝒜ℬ𝒞 cells (rows 𝒜1ℬ1 to 𝒜2ℬ4, columns 𝒞1 to 𝒞4); the four circled X's mark the fitted constants abc111, abc113, abc121, and abc132.]

Table 3.4

Fitting of 𝒜ℬ𝒞-Constants in Example 4


The model containing 26 DIVs was fitted by automatically generating the full model of order D=2 (CC No. 1, columns 4-5) and by adding the four 𝒜ℬ𝒞-constants via CC No. 4; see the reproduced input sheet.
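The count of 26 DIVs can be checked by counting degrees of freedom: each factorial effect of a full model contributes the product of (number of levels - 1) over its factors. The sketch below does this for the 2 x 4 x 4 layout of Example 4; the function name and the factor labels "A", "B", "C" are illustrative only.

```python
# Degrees of freedom (number of constants) per factorial effect in a full
# model of a given order, for qualitative factors with the given numbers
# of levels.  Names are illustrative, not NOVACCM identifiers.
from itertools import combinations
from math import prod


def full_model_df(levels, order):
    """Map each effect (e.g. "AB") to its number of constants."""
    df = {}
    for k in range(1, order + 1):
        for combo in combinations(sorted(levels), k):
            df["".join(combo)] = prod(levels[f] - 1 for f in combo)
    return df


if __name__ == "__main__":
    levels = {"A": 2, "B": 4, "C": 4}   # sex, type of wheat, litter
    second_order = full_model_df(levels, 2)
    print(second_order)                  # main effects and two-factor terms
    print(sum(second_order.values()) + 4)  # plus the 4 fitted ABC constants
```

For order 2 this gives 7 main-effect and 15 two-factor constants, i.e. 22, and the four hand-added three-factor constants bring the total to 26; a full third-order model would carry 9 constants for the three-factor interaction.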


Of the NOVACCM printout for Example 4 only the two FCAs are reproduced. The cumulative dropping procedure results in a significant model (at the level ALPHA(2) = 0.05) containing only the main effects of 𝒜 and ℬ. The single dropping procedure results in a model, again at the 0.05 level, containing the effects 𝒜, ℬ, 𝒞, and 𝒜ℬ. Dividing DIFF MS = 197.28205 of 𝒜ℬ in cumulative dropping (Step 4) by MS(2) = 49.722222 yields F = 3.968 which, with 3 and 9 degrees of freedom, is significant at α = 0.05. Therefore, the gap between the two significant models can be closed, and the conclusion would be that the significant model, at the 0.05 level, contains the three main effects (with 7 degrees of freedom) and the 𝒜 x ℬ interaction (with 3 degrees of freedom). The ranking order within the significant model shows that 𝒜 and ℬ are of approximately equal importance, whereas 𝒞 and 𝒜ℬ are less important, with 𝒜ℬ being marginally significant.


The above conclusions are essentially those which are reached in the analysis by Stevens ("Table 12" on page 365 of the paper). However, Stevens separated one degree of freedom from the 𝒜ℬ-interaction, which enabled him to allocate the significance of 𝒜ℬ to this one degree of freedom. In NOVACCM, the split-up of qualitative factorial effects into single-degree-of-freedom contrasts is not possible.

Table 12 of the Stevens paper also allows a comparison with the sums of squares obtained by NOVACCM.

The following table of values is computed from the FCA, cumulative dropping, columns "MS(1)" and "DF(1)."

Step          SS(1) = MS(1) x DF(1)    DF(1)    Due to              Stevens Value

7                   9272.50              26     all effects              9272
4                   1401.07              19     all interactions           -
Difference          7871.43               7     3 main effects           7871

7                   9272.50              26     all effects              9272
3                    809.22              16     𝒜𝒞, ℬ𝒞, 𝒜ℬ𝒞               813
Difference          8463.28              10     𝒜, ℬ, 𝒞, 𝒜ℬ              8459

4                   1401.07              19     all interactions           -
3                    809.22              16     𝒜𝒞, ℬ𝒞, 𝒜ℬ𝒞               813
Difference           591.85               3     𝒜ℬ                        588

The discrepancies between the values obtained by Stevens and those by NOVACCM are small and may be attributed to the lesser accuracy of the computational procedure employed by Stevens. Note that the Stevens value 588 for the sum of squares due to 𝒜ℬ is the sum of the values 462 and 126 in his Table 12.
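The "Difference" rows of the comparison are obtained by subtracting the SS(1) and DF(1) values of two steps of the cumulative-dropping FCA. A minimal sketch of that arithmetic, using the three steps quoted above (the dictionary and function names are illustrative):

```python
# (SS(1), DF(1)) at selected steps of the cumulative-dropping FCA of
# Example 4, as quoted in the text.
FCA_STEPS = {7: (9272.50, 26), 4: (1401.07, 19), 3: (809.22, 16)}


def ss_difference(step_hi, step_lo):
    """Sum of squares and degrees of freedom gained between two steps."""
    ss_hi, df_hi = FCA_STEPS[step_hi]
    ss_lo, df_lo = FCA_STEPS[step_lo]
    return round(ss_hi - ss_lo, 2), df_hi - df_lo


if __name__ == "__main__":
    for hi, lo in [(7, 4), (7, 3), (4, 3)]:
        ss, df = ss_difference(hi, lo)
        print(f"Step {hi} - Step {lo}: SS = {ss:.2f}, DF = {df}, MS = {ss / df:.3f}")
```

The last difference, 591.85 with 3 degrees of freedom, gives a mean square of about 197.28, which agrees with the DIFF MS of Step 4 used above to close the gap between the two significant models.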
3.4.5 Example 5

Example 5 is a numerical illustration of Example E (a 3x3x2
factorial) which is treated in general terms in Appendix A. Example 5 serves to illustrate the capability of NOVACCM in analyzing incomplete and unbalanced data classifications when there is confounding among the factorial effects.

The numerical data of the example has been generated according to the following model in which all effects involving factor 𝒞 are absent:

$$y_{\alpha\beta\gamma\rho} = Y_{\alpha\beta\gamma} + e_{\alpha\beta\gamma\rho} = m + a_\alpha + b_\beta + ab_{\alpha\beta} + e_{\alpha\beta\gamma\rho},$$

where $e_{\alpha\beta\gamma\rho} \sim \mathrm{NID}(0,1)$ and where, according to the structure of Example E, α = 1,2,3; β = 1,2,3; γ = 1,2. Actually, therefore, one deals with a two-factor classification containing a dummy third factor, 𝒞. Consequently, the ranking process is expected to yield a significant model containing only the constants $a_\alpha$, $b_\beta$, and $ab_{\alpha\beta}$.

In the construction of the data the following values were assigned to the model constants:

m = 13

a1 = 4          b1 = -3

a2 = 11         b2 = 8

ab11 = 5

ab21 = -19      ab22 = 3

With these values, the following Table 3.5 of expected cell means, Y_αβγ, and actual "observations", y_αβγρ = Y_αβγ + e_αβγρ, has been constructed, using a table of random normal deviates with σ = 1. (See also Figure 5a in Appendix A.) For example, Y111 = 13 + 4 - 3 + 5 = 19, and e1111 = 0.8. Note also that repeated observations y_αβγρ have been included in 5 cells, which will provide an estimate of σ² = 1 based on 5 degrees of freedom.
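The expected cell means follow from the constants by simple addition, as in the worked case Y111 = 13 + 4 - 3 + 5 = 19. The sketch below assumes that constants not listed (those carrying a level index 3) are zero, which is what the cell means of Table 3.5 suggest; the values b2 = 8 and ab22 = 3 used here are inferred from the expected cell means 21 and 35 of that table, and all names are illustrative.

```python
# Expected cell means of Example 5 from the model constants.
# Assumption: constants carrying the level index 3 are zero, so that
# Y(alpha, beta) = m + a_alpha + b_beta + ab_(alpha, beta); factor C is a
# dummy factor and does not enter.  b2 = 8 and ab22 = 3 are inferred from
# the expected cell means 21 and 35 in Table 3.5.
M = 13
A = {1: 4, 2: 11, 3: 0}
B = {1: -3, 2: 8, 3: 0}
AB = {(1, 1): 5, (2, 1): -19, (2, 2): 3}


def expected_mean(alpha, beta):
    """Expected value of y in the cells with factor levels alpha, beta."""
    return M + A[alpha] + B[beta] + AB.get((alpha, beta), 0)


if __name__ == "__main__":
    for alpha in (1, 2, 3):
        print([expected_mean(alpha, beta) for beta in (1, 2, 3)])
```

The combination alpha = 1, beta = 2 is printed only for completeness; the corresponding cells are empty in the layout, so no constant ab12 is assigned.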


                   ℬ1                   ℬ2                   ℬ3

𝒜1    𝒞1     (Y111 = 19)                                (Y131 = 17)
              y1111 = 19.8                               y1311 = 15.8

       𝒞2     (Y112 = 19)                                (Y132 = 17)
              y1121 = 18.2                               y1321 = ?
              y1122 = 20.7                               y1322 = 16.1

𝒜2    𝒞1     (Y211 = 2)           (Y221 = 35)
              y2111 = 2.1          y2211 = 34.7
                                   y2212 = 35.3

       𝒞2                          (Y222 = 35)           (Y232 = 24)
                                   y2221 = 35.6          y2321 = 24.0

𝒜3    𝒞1     (Y311 = 10)                                (Y331 = 13)
              y3111 = 10.4                               y3311 = 13.0
              y3112 = 10.8

       𝒞2     (Y312 = 10)          (Y322 = 21)           (Y332 = 13)
              y3121 = 12.8         y3221 = 21.6          y3321 = 13.4
                                   y3222 = 20.3

Table 3.5

Data Layout Example 5


Because of the confounding in the given data layout, there are three possible models upon which the ranking process can be based, as is described in Appendix A. All three models have in common the following part:

$$Y^{(0)} = m + a_1x_1 + a_2x_2 + b_1x_3 + b_2x_4 + c_1x_5 + ab_{11}x_6 + ab_{22}x_7 + ac_{11}x_8 + bc_{11}x_9 + abc_{111}x_{10}.$$

Each model, in addition to $Y^{(0)}$, contains two more constants, namely a pair from the three confounded constants ab21, ac21, and bc21. Thus the three models are defined as follows:

$$\text{Model I:}\quad Y^{(I)} = Y^{(0)} + ab_{21}x_{11}^{(I)} + ac_{21}x_{12}^{(I)}$$

$$\text{Model II:}\quad Y^{(II)} = Y^{(0)} + ab_{21}x_{11}^{(II)} + bc_{21}x_{12}^{(II)}$$

$$\text{Model III:}\quad Y^{(III)} = Y^{(0)} + ac_{21}x_{11}^{(III)} + bc_{21}x_{12}^{(III)}$$

The three models are generated by NOVACCM as follows. A third order model (D=3, columns 4-5, CC No. 1) is generated automatically from which, in each case, 5 DIVs are deleted via CC No. 4. The 3 respective sets of CC No. 4 have in common the 4 DIVs ab12, abc121, abc211, and abc221. In addition, CC 4 Set No. 1 contains the constant bc21, Set No. 2 contains ac21, and Set No. 3 contains ab21, corresponding to the three models defined before.

Since Model III contains two constants (ac21 and bc21) representing interactions with the dummy factor 𝒞, this model must be expected to yield improbable results because the effect appropriately measured by the constant ab21 is assigned to degrees of freedom associated with 𝒜𝒞 and ℬ𝒞. Models I and II, however, should yield the proper significant model since both contain the constant ab21.

These assumptions are verified by the results of the ranking processes as shown in the FCAs.

In this example, only the "Final FCA" is reproduced, which combines the individual FCAs given for the three sets of CC 4, that is, for the three models.

In practice, the user of NOVACCM does not know which CC 4 Set will yield the proper significant model. However, as was discussed in Section 3.3.2, he may conclude that the significant model containing the smallest number of effects is the proper one. This model has been called "the most probable significant model."

Looking at the rankings as established for the three models (the FCAs for "SET 1", "SET 2", and "SET 3"), one can see that the first two significant models contain only the effects 𝒜, ℬ, and 𝒜ℬ at all three significance levels α used as input. Model III (Set 3), however, leads to a significant model (at α = 0.05 in cumulative dropping and α = 0.01 in single dropping) containing all effects except 𝒜ℬ𝒞. Therefore, the user would conclude that either Model I or Model II was the right one to use since both led to the same "most probable significant model."


Although the problems of estimation are not discussed in the present report, it is interesting to see how close to their true values the constants are estimated in the significant model which reads as follows for both Models I and II (the printout of the regression coefficients of Step 5 for "Set 1" and "Set 2" is not reproduced):

$$\hat{Y} = 13.2 + 2.7x_1 + 10.8x_2 - 1.9x_3 + 7.8x_4 + 5.6x_6 + 3.4x_7 - 20.0x_{11}.$$

That is, one has:

$\hat{m} = 13.2$

$\hat{a}_1 = 2.7$          $\hat{b}_1 = -1.9$

$\hat{a}_2 = 10.8$         $\hat{b}_2 = 7.8$

$\widehat{ab}_{11} = 5.6$

$\widehat{ab}_{21} = -20.0$      $\widehat{ab}_{22} = 3.4$

None of these estimates is significantly different from the true values (which were listed earlier) when testing at the 0.05 significance level.
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3.4.6  Example  6 

Among all examples exhibited in the present report, Example 6 shows the largest number of NOVACCM features in combination. Example 6 is one of analysis of covariance for an unbalanced and incomplete 3x3x3 data classification with 2 dependent variables and 3 OCIVs. The data is synthetic.

Of the 27 cells of the layout 9 were randomly selected to be empty. Factors 𝒜 and 𝒞 are quantitative, and factor ℬ is qualitative. The quantitative factor level values are unequally spaced and their values are coded as are those of the 3 OCIVs.

Two different models were used in the construction of the data for the two dependent variables: For Y1 a model was used in which 𝒞 is a dummy factor, whereas for Y2 a model was used where 𝒜 is a dummy factor. The constants of the two models are as follows:


For Y1:

m = 13

a1 = 5          b1 = -6

a2 = 9          b2 = 3

ab11 = 2        ab12 = 40

ab21 = -10

For Y2:

m = 13

b1 = 20         c1 = 5

b2 = 10         c2 = 25

bc11 = 1        bc12 = 30

bc21 = -16      bc22 = 60

(Note. The constant ab22 was not needed in the model for Y1; see further below.)


Table 3.6 shows the data layout of the values of Y1 and Y2, i.e., of the expected values of the response variables. Also shown are the numbers of repeated observations in the cells, R_αβγ, and the values (factor levels) of the quantitative factor variables X_a and X_c.

                        𝒞1                        𝒞2 (X_c2 = 2)             𝒞3 (X_c3 = 9)

𝒜1              ℬ1     R111=2  Y1=14  Y2=39      R112=1  Y1=14  Y2=88      R113=2  Y1=14  Y2=33
                 ℬ2     R121=0                    R122=4  Y1=61  Y2=108     R123=0
                 ℬ3     R131=2  Y1=18  Y2=18      R132=1  Y1=18  Y2=38      R133=1  Y1=18  Y2=13

𝒜2 (X_a2 = 6)   ℬ1     R211=3  Y1=6   Y2=39      R212=0                    R213=0
                 ℬ2     R221=1  Y1=25  Y2=12      R222=0                    R223=1  Y1=25  Y2=23
                 ℬ3     R231=1  Y1=22  Y2=18      R232=3  Y1=22  Y2=38      R233=1  Y1=22  Y2=13

𝒜3 (X_a3 = 7)   ℬ1     R311=2  Y1=7   Y2=39      R312=1  Y1=7   Y2=88      R313=2  Y1=7   Y2=33
                 ℬ2     R321=0                    R322=0                    R323=0
                 ℬ3     R331=1  Y1=13  Y2=18      R332=1  Y1=13  Y2=38      R333=0

Table 3.6

Data Layout Example 6
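The expected values in Table 3.6 can be recomputed from the two lists of constants given before the table. The sketch below assumes, as the tabulated values suggest, that a constant is simply added whenever its level indices match the cell and that constants not listed (in particular those with level index 3, and ab22) are zero. The dictionary and function names are illustrative.

```python
# Expected values Y1 and Y2 of Example 6 from the model constants.
# Assumption: unlisted constants are zero, so Y1 depends on (alpha, beta)
# only and Y2 on (beta, gamma) only.
Y1_CONST = {"m": 13, "a": {1: 5, 2: 9}, "b": {1: -6, 2: 3},
            "ab": {(1, 1): 2, (1, 2): 40, (2, 1): -10}}
Y2_CONST = {"m": 13, "b": {1: 20, 2: 10}, "c": {1: 5, 2: 25},
            "bc": {(1, 1): 1, (1, 2): 30, (2, 1): -16, (2, 2): 60}}


def two_factor_mean(const, first, second, i, j):
    """m + two main-effect constants + their interaction constant."""
    return (const["m"] + const[first].get(i, 0) + const[second].get(j, 0)
            + const[first + second].get((i, j), 0))


def expected_y1(alpha, beta):
    return two_factor_mean(Y1_CONST, "a", "b", alpha, beta)


def expected_y2(beta, gamma):
    return two_factor_mean(Y2_CONST, "b", "c", beta, gamma)


if __name__ == "__main__":
    print([[expected_y1(a, b) for b in (1, 2, 3)] for a in (1, 2, 3)])
    print([[expected_y2(b, c) for c in (1, 2, 3)] for b in (1, 2, 3)])
```

The value printed for Y1 at alpha = 3, beta = 2 has no counterpart in the table, since all three cells of that combination are empty.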


Considering, for the moment, all three factors as being qualitative, Table 3.7 shows the fitting of 𝒜ℬ𝒞-constants using the method explained in Appendix A. Only one 𝒜ℬ𝒞-constant may be fitted; abc111 was selected from the two possible constants (the other one being abc112).


[Table 3.7: checkmark diagram of the 27 cells (rows 𝒜1ℬ1 to 𝒜3ℬ3, columns 𝒞1 to 𝒞3); the single circled X marks the fitted constant abc111.]

Table 3.7

Fitting of 𝒜ℬ𝒞-Constants in Example 6


Since factors 𝒜 and 𝒞 are, in reality, quantitative factors, the fitted constant abc111 must be interpreted accordingly. One can easily see that abc111 can represent the interaction between the component 𝒜_lin x 𝒞_lin and factor ℬ. Therefore, abc111 is equivalent to X_a b1 X_c with 1 degree of freedom. In the terms of the NOVACCM notation this is the DIV 1.1 x 2*1 x 3.1. Since this is the only DIV representing the interaction 𝒜ℬ𝒞 (containing both qualitative and quantitative factors), this DIV is also a PFFE.


Tables 3.8 a-c show the fitting of two-factor interaction constants. Again, all three factors are considered to be qualitative for the moment. As can be seen, 3 of the 4 𝒜ℬ-constants can be fitted, and all 4 𝒜𝒞- and all 4 ℬ𝒞-constants can be fitted. Since all 6 main effect constants can be fitted, one has 6 + 11 + 1 = 18 constants which appear as if they can be fitted. However, there are only 27 - 9 - 1 = 17 degrees of freedom "between cells." Consequently, there must be one identity in the data. Looking at Table 3.7, one notes at once that eliminating all observations from cell 𝒜1ℬ2 also eliminates all observations from cell ℬ2𝒞2. Therefore, the identity is:

𝒜1ℬ2 = ℬ2𝒞2

Tables 3.8 a-c

Fitting of the Two-Factor Interaction Constants in Example 6
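The identity can be made concrete by writing out, over the 18 non-empty cells, the 0/1 columns that the constants ab12 and bc22 would occupy if all three factors are treated as qualitative. The sketch below is only an illustration: the cell list is read off the R-values of Table 3.6, the 0/1 indicator coding is an assumption about how such constants enter the design matrix, and the names are hypothetical.

```python
# Non-empty cells (alpha, beta, gamma) of Example 6, read from the
# R-values of Table 3.6.
NONEMPTY_CELLS = [
    (1, 1, 1), (1, 1, 2), (1, 1, 3), (1, 2, 2), (1, 3, 1), (1, 3, 2), (1, 3, 3),
    (2, 1, 1), (2, 2, 1), (2, 2, 3), (2, 3, 1), (2, 3, 2), (2, 3, 3),
    (3, 1, 1), (3, 1, 2), (3, 1, 3), (3, 3, 1), (3, 3, 2),
]


def indicator(cells, alpha=None, beta=None, gamma=None):
    """0/1 column that is 1 in the cells matching all the given levels."""
    wanted = (alpha, beta, gamma)
    return [int(all(w is None or w == v for w, v in zip(wanted, cell)))
            for cell in cells]


if __name__ == "__main__":
    ab12 = indicator(NONEMPTY_CELLS, alpha=1, beta=2)
    bc22 = indicator(NONEMPTY_CELLS, beta=2, gamma=2)
    print(ab12 == bc22)             # identical columns: confounded constants
    print(len(NONEMPTY_CELLS) - 1)  # degrees of freedom "between cells"
```

Both columns are non-zero only in the single cell 𝒜1ℬ2𝒞2, so they cannot be distinguished by the data; this is why only one of the two constants can be fitted.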


One can fit either ab12 or bc22, but not both at the same time.

In order not to add unnecessarily to the amount of printout, it was decided to have only one CC 4 set, i.e., to fit bc22. (The fitting of bc22 causes the interaction 𝒜ℬ, which is represented by 2 constants, to be the only other PFFE in addition to 𝒜ℬ𝒞.) Since 𝒞 is a dummy factor for Y1, the fitting of bc22 (instead of ab12) should lead to [text illegible]. These assumptions [text illegible]
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The  interpretation  of  the  17  fitted  constants  for  the  real 
situation of factors 𝒜 and 𝒞 being quantitative is not difficult. The relations for all 17 constants fitted are as follows:

Fit X_a and X_a² instead of a1 and a2.

Fit X_c and X_c² instead of c1 and c2.

Fit X_a b1 and X_a² b1 instead of ab11 and ab21.

Fit X_a X_c, X_a X_c², X_a² X_c, X_a² X_c² instead of ac11, ac12, ac21, ac22.

Fit b1 X_c and b1 X_c² instead of bc11 and bc12.

Fit b2 X_c and b2 X_c² instead of bc21 and bc22.

Fit X_a b1 X_c instead of abc111.

The 17 DIVs were "hand"-generated via CC No. 4; see the reproduced input sheet. A third order model in the three OCIVs was automatically generated from which 9 GCIVs were deleted via CC No. 5, leading to a total of 10 covariates (CIVs) in this analysis of covariance example.
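The count of 10 covariates can be checked under the assumption that a model of order P in T original covariates consists of all power products of the OCIVs with total degree between 1 and P. This reading of "third order model" is an assumption made for the sketch; the function name is illustrative.

```python
# Number of covariate terms in a model of a given order, assuming the
# automatically generated model contains every power product of the
# original covariates (OCIVs) of total degree 1..order.
from itertools import combinations_with_replacement


def power_products(n_ocivs, order):
    """List the power products as tuples of OCIV numbers, e.g. (1, 1, 3)."""
    terms = []
    for degree in range(1, order + 1):
        terms.extend(combinations_with_replacement(range(1, n_ocivs + 1), degree))
    return terms


if __name__ == "__main__":
    terms = power_products(3, 3)
    print(len(terms))       # terms in the full third order model
    print(len(terms) - 9)   # covariates left after deleting 9 of them
```

Under this assumption the full third order model in three OCIVs has 3 + 6 + 10 = 19 terms, so deleting 9 generated terms leaves the 10 CIVs mentioned above.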


The residual terms, e, in the observed values, y = Y + e, of the two dependent variables were taken from a table of normal random deviates with σ = 1. The 30 values each of y1 and y2 and the 30 values each of the three OCIVs are given on the reproduced input sheet.

The significant CIVs, if any, to be kept in the model for the FEMO ranking are determined by the choice of KALPHA = 1 and ALPHA(1) = 0.05. ALPHA(2) = 0.01 may be considered as the principal significance level in this example. See also the discussion in Section 3.1.5. All rankings in the present example are performed under restricted admissibility rules (CAD = 0 in column 38 of CC No. 1).

The  printout  exhibited  for  Example  6  consists  of  the 
identification  of  the  IVs,  all  FCAs,  and  the  ranking  information  for  the 
ANVA  of  OCIV  1.  Following  are  the  notes  referring  to  the  printout. 

Notes  on  printout  Example  6. 

Note 6.1. The coded factor levels (values of the quantitative factor variables) are printed. For example, the coded value of X_a3 is 0.4.

Note 6.2. The identification of the 17 IVs and the 14 effects is reproduced since an example of the identification for which the number of IVs is different from the number of effects was not shown previously. (Also, the numbering of the effects is necessary information if the reader wants to follow the ranking process in the ANVA for OCIV No. 1; see Note 6.11 further below.) Note that effects Nos. 6 and 7 represent the PFFE 𝒜ℬ with 2 degrees of freedom and that effect No. 14 represents the PFFE 𝒜ℬ𝒞 with 1 degree of freedom.

Note 6.3. For y1, only OCIV No. 1 is significant at the 0.05 level. Note that due to the fact of having fitted 10 CIVs there are only 2 degrees of freedom for error: DF(2) = 2.

With OCIV No. 1 being significant at ALPHA(1) = 0.05, this OCIV will be kept in FEMO and, therefore, the program will perform the ANVAs for y1 and OCIV No. 1.

Note 6.4. COMO, single dropping, yields a significant model (at α = 0.05) containing 4 CIVs. However, with only 2 degrees of freedom for error in COMO, cumulative dropping, it does not make sense to try to close the gap between the two models. (In practice, one would not fit such a large covariate-model as was done here for demonstration purposes.)

Note 6.5. The significant model for y1 resulting from FEMO, cumulative dropping, contains four factorial effects involving the dummy factor 𝒞, as was predicted. DF(2) equals 11 after one degree of freedom for OCIV No. 1 was subtracted from the degrees of freedom "within cells."

Note 6.6. The single dropping procedure of FEMO for y1 results in the same significant model as was obtained with the cumulative procedure.

Note 6.7. For the second dependent variable, y2, COMO, cumulative dropping, does not show any significant CIVs. Therefore, no ANVA will be performed for y2 or any other OCIV than No. 1.

Note 6.8. The single dropping procedure of COMO for y2 does show significant CIVs; however, again because of only 2 degrees of freedom for error in the cumulative dropping procedure, closing the gap between the two models is not worthwhile trying.

Note 6.9. FEMO, cumulative dropping, yields a significant model for y2 which contains effects involving factors ℬ and 𝒞 only, as was predicted. That is, there are no factorial effects in the significant model involving factor 𝒜, which is a dummy factor for y2. The significant model is reached rather abruptly at Step 10: the *-procedure had to be applied in order to continue the ranking. The ranking order within the significant model shows factor 𝒞 to be by far the more important of the two factors.

Note 6.10. The single dropping procedure of FEMO results in a significant model for y2 which contains, at the 0.05 level, also the two degrees of freedom representing the main effect of the dummy factor 𝒜. Indeed, dividing DIFF MS = 4.0072454 of Step 8 in FEMO, cumulative dropping, by MS(2) = .68853333 yields F = 5.821 which, with 1 and 12 degrees of freedom, is marginally significant at the 0.05 level. However, because of this marginal significance (which actually is random, as is known from the construction of the data!) and because ALPHA(2) = 0.01 was decided upon in advance to be used as the principal significance level, one may say that both dropping procedures show the same significant model (containing effects involving ℬ and 𝒞 only).

Note 6.11. This is the printout of the ranking information for the ANVA of OCIV No. 1. (This and the FCA is the only information ever given for any ANVA.)

Note 6.12. In the last step of the FEMO-type ranking for OCIV No. 1, the **-procedure had to be applied. (For practical purposes, this has no influence upon the ranking here since effect No. 3 is the only one left not yet ranked and, thereby, is the most important effect by definition.) The last three I(X)-values are those of "Step 14", "Step 14*", and "Step 14**."

Note 6.13. In the identification of the ANVA printout the cardinal number of the OCIV or the symbol of the dependent variable, Y, is given. Since the present ANVA is that for OCIV No. 1, the identification "0101" is printed.

Note 6.14. The FCA, ANVA (cumulative dropping) for OCIV No. 1 shows that the factors and their interactions, in the present example, had significant effects upon this concomitant independent variable. Except for the interaction 𝒜ℬ, all factorial effects contained in the significant model for y1 in the analysis of covariance (see Note 6.5) are also contained in the significant model (at α = 0.01) for OCIV No. 1. This happens because the numerical values of OCIV No. 1 were constructed such that they are highly correlated with the values of y1.

Note 6.15. This is the FCA, ANVA, for y1. (The preceding ranking information is not exhibited here.) In other words, the FCA shown is that which would have been obtained for y1 if no OCIVs had been included in the FEMO ranking. (See the degrees of freedom for error: DF(2) = 12, which is the number of degrees of freedom for "within cells.")

The ranking order within the significant model for y1 alone is slightly different from that obtained in the analysis of covariance (see Note 6.5), but both significant models contain the same set of effects. It is obvious that the significance of the factorial effects is much higher when OCIV No. 1 is excluded from the model. That is, the present example shows how the use of covariates can cause a decrease in the power of the F-test although the residual variance is considerably reduced (from MS(2) = 1.0[digits illegible] with 12 degrees of freedom to MS(2) = .11673482 with 11 degrees of freedom in the present case). Without having the ANVAs available the analyst would not know whether the factors had effects upon the covariate(s) nor whether the sensitivity of the analysis was decreased by performing an analysis of covariance ranking rather than an analysis of variance ranking.

Note 6.16. The "Problem Running Time" of 4 minutes and 11 seconds is that for both dependent variables and includes the time for the 2 ANVAs.
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3.4.7  Example  7 

The purpose of exhibiting Example 7 is to show the capability of NOVACCM in dealing with unacceptable inverses of the matrices (A) of the normal equations. In order to show this, Example 6 was modified such that a singularity is introduced into the matrix A of rank N+1. This was achieved by fitting both confounded constants ab12 and bc22 while abc111 was not fitted, which again makes 17 constants fitted as in Example 6. (In the proper interpretation considering that factors 𝒜 and 𝒞 are quantitative and in the NOVACCM notation, the two confounded constants are 1.1 x 2*2 and 2*2 x 3.2; see also the reproduced input sheet.)

Because of the purpose mentioned, the problem was executed for y1 only, and as covariates only the 3 OCIVs were used, i.e., no GCIVs were generated for Example 7.

The numerical values for y1 and the 3 OCIVs used in the present case are the same as in Example 6. Again only the FCAs are shown in the reproduced printout.

Notes on printout Example 7.

Note 7.1. The inverses, A⁻¹, for all three steps in COMO were rejected because the determinants were found to be negative. The 3 OCIVs were deleted "from the right": in the order of input, OCIV No. 3 was the "rightmost" admissible CIV and, therefore, was deleted from the model at the first step. See also Flowchart No. 4 in Section 2.4.1.
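Why fitting two confounded constants makes the inverse unacceptable can be seen on a toy case: two identical columns in the design matrix make the matrix of the normal equations singular. The sketch below is not NOVACCM's acceptance test, which is documented elsewhere; it uses exact rational arithmetic and a made-up four-observation design, whereas in floating-point arithmetic the computed determinant of such a matrix may come out as a small positive or negative number instead of zero.

```python
# Toy illustration: duplicated (confounded) columns give a singular
# matrix of normal equations A = X'X.  The design rows and function names
# are made up for illustration only.
from fractions import Fraction


def normal_matrix(rows):
    """Return A = X'X for a design matrix given as a list of rows."""
    cols = list(zip(*rows))
    return [[sum(Fraction(u) * v for u, v in zip(ci, cj)) for cj in cols]
            for ci in cols]


def determinant(matrix):
    """Determinant by Gaussian elimination in exact arithmetic."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for k in range(n):
        pivot = next((r for r in range(k, n) if a[r][k] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != k:
            a[k], a[pivot] = a[pivot], a[k]
            det = -det
        det *= a[k][k]
        for r in range(k + 1, n):
            factor = a[r][k] / a[k][k]
            for c in range(k, n):
                a[r][c] -= factor * a[k][c]
    return det


if __name__ == "__main__":
    # Columns: mean, first constant, second (confounded) constant.
    x_both = [[1, 0, 0], [1, 1, 1], [1, 1, 1], [1, 0, 0]]
    x_one = [row[:2] for row in x_both]
    print(determinant(normal_matrix(x_both)))  # singular
    print(determinant(normal_matrix(x_one)))   # non-singular
```

Dropping either of the two duplicated columns restores a non-singular matrix, which corresponds to the deletion of one of the confounded constants during the ranking.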

Note 7.2. The FCA for COMO, single dropping, shows the appropriate statement for this case.

Note 7.3. The first 6 steps in FEMO led to rejections of the inverses because only at Step 6 of the ranking was the constant 2*2 x 3.2 deleted from the model, whereby the singularity in the matrix of the normal equations was eliminated. Because of the rather arbitrary deletion of effects which are "rightmost" among the effects admissible for ranking at a given step, the program is not very efficient in eliminating the singularity at the earliest possible step. If the effect 2*2 x 3.2 (which was admissible at the first step!) had been deleted at the first step, the remaining 12 steps would have represented a genuine FEMO ranking. However, the analyst who meets a similar situation will, no doubt, execute the problem a second time after correcting for the cause of the rejections. The results of the first trial will usually be of considerable help to the analyst for the indicated correction. In the present example, the analyst would rightfully suspect that effect 2*2 x 3.2 caused the previous model rejections and he would take the appropriate corrective action. (The cause for the rejections could be other than in the present example where it was assumed that the analyst made a mistake in fitting the constants. For example, the cause may be the insufficient accuracy of the inversion process. One may also consider using the corrective capability of NOVACCM for the detection of confounding if it cannot be detected otherwise.) Note that DF(2) = 19 at Step 7 are the 12 degrees of freedom "within cells" pooled with the 7 degrees of freedom due to the 6 deleted effects.

Note 7.4. The FCA of FEMO, single dropping, shows only the "good" steps, i.e., the steps at which the inverses were accepted.
4. GLOSSARY OF TERMS USED IN THE REPORT


The  page  numbers  In  the  following  alphabetical  glossary  give  the 
pages  where  the  main  definitions  are  introduced  (page  number  underlined) 
or  where  additional  pertinent  information  concerning  a  term  is  given 
(page  masher  r.ot  underlined).  The  glossary  is  not  a  complete  reference 
to  all  pig-?s  where  a  term  is  discussed  or  mentioned.  Rather,  the  glossary 
is  intended  $.s  a  guide  to  the  page  where  a  given  term  is  introduced. 


A  (»  matrix  of  normal  equations)  18 

A  (=  numbor  of  levels  of  factor  O)  4 

Additional  analysis  of  variance  (ANVA)  8,20 

"Additional"  regression  sum  of  squares  (■  K>, 14,17 

Admissibility  (of  CIV  or  effect  for  rankim)  14,15,20 

ALPHA  (KALFHA)  12 

ANVA  (=  Additional  Analysis  of  Variance)  8,20 

ASSR(N)  (-  "total"  regression  sum  of  squares  adjusted  10 

for  the  mean) 

ATSS  ( =  total  sum  of  squares  adjusted  for  the  mean)  10 

Automatic  Generation  (of  CIVs)  21 

Automatic  Generation  (of  DIVs)  25 

Auxiliary  independent  variable  (\\  )  £ 

B  (■  number  of  levels  of  factor  8)  4 

Backward  ranking  method  £,13 

CIV  (=  Concomitant  independent  Variable)  ^,21 

Coding  (of  OCIVs)  ^4 

Coding  (of  quantitative  factor  variables)  36 


COMO  (=  COnoamitant  Variables  Magnitude  [of  prediction 
power  for  y]  Ordering) 
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Complete  printout 
Compound 

Control  Card  4  Set 
Cumulative  dropping  (in  CCMO) 
Cumulative  dropping  (in  FIWO) 
Cumulative  ranking 
D  (=  order  of  DIV-model) 

Data  input 
Data  matrix 
Deletion  (of  CIVa) 

Deletion  (of  DlVa) 

Design  Independent  Variable  (DIV) 
Design  matrix 

DIV  ( -  Design  Independent  Variable) 
Effect  (=  factorial  effect) 
Factorial  effect 
Factor  number 
Factor  pair 


FCA  (=  Final  Comprehensive  Analysis) 

FEMO  (=  Factorial  Effects  Magnitude  [of  prediction 
power  for  y]  Ordering) 

Final  Comprehensive  Analysis  (FCA) 

Final  FCA 


First  good  step 


8,80,91 


2^,26 

2^,27 


Pull  data  matrix 


66 


Full effect 28

Full model 81

Full printout (at significant step of ranking) 18,19

GCIV  (=  Generated  CIV)  £,21 

Generated  CIV  (GCIV)  £,21 

Generation  (of  CIVs)  21 

Generation  (of  DIVs)  CJ 

Good  model 

Good  step  i2 

Hand-generation  (of  CIVa)  2%  , 83 

Hand-generation (of DIVs) 81

I (= $A^{-1}A$ = computed identity matrix) 18

ISUttC  12 

IV  (=  independent  Variable)  3 

I(X)  (=  Non-Significance)  12,  20 

KALFHA  12 

Level  number  25 

Main  Theorem  (of  multiple  regression)  10 

Matrix of normal equations (A) 18

"Most probable" significant model 6,91

N  (-  total  number  of  independent  variables)  i 

n  (=  total  number  of  observed  y-values)  10 

"Non-orthogonal" analysis of variance 1
Non-Significance  (I(X))  il 

NOVACCM  i 

OCIV  (-Original  CIV)  1>2° 

OCIV  number  21 

-  Order  (of  DIV)  £2. 

Original  CIV  (OCIV)  i>2° 

"Orthogonal"  analysis  of  variance  1 

P  (»  order  of  CIV-Bod el)  £1 

Partially  Fitted  Full  Effect  (PFFE)  17»£2 

PFFE  (-  Partially  Fitted  Full  Effect)  17 ,'H 

Power  (of  CIV)  £i>23 

Power  (of  quantitative  factor  variable)  £2 

Power -Bum 
* -procedure 

Qualitative  factor  Hi 

Quantitative factor 6

(Quantitative  factor  variable  (X.iX^,...)  6 

(■  number  of  observations  in  cell  afly...)  Jj, 

Rejected  model  19 

Relaxed  admissibility  (of  effects  for  ranking)  17 ■  28 

Restricted  admissibility  (of  CIVs  or  effects  for  ranking)  H , 13 , 16 , 17 ,23,27 
Restriction-dependence  (of  SSH_N-  -values)  H* ,17 


Significant  model 


2,  IMS  is,  19 


Single dropping 13

Single dropping (in CCMO) 40

Single dropping (in FEMO) 42

SS„_H.  («  "additional"  regression  sub  of  squares)  10 A1* >17 

Sub-CIV  16,22. 

Sub-DIV  16, 2T 

Sub-effect  1^,2. 

Simulation  matrix  19>3T 

T  (-  total  nuaber  of  covarlatcs)  1  2., 21 

TOLIl’ IS

"Total" regression sum of squares (= ASSR(N)) 10

TF  (-  total  number  of  OCIVs)  £1 

\iy  (^  vth  auxiliary  Independent  variable''  £,7 

Unrestricted admissibility (of CIVs or effects for ranking) 17

V (= number of dependent variables in a problem) 66

X,  (=  quantitative  factor  variable  for  factor  (j)  6 

\  («=  quantitative  factor  v^x'able  for  factor  8)  6 

Zero  error  perfect  fit  68 


Appendix A:


METHOD OF FITTING CONSTANTS FOR NON-ORTHOGONAL
LAYOUTS WITH INTERACTIONS AND EMPTY CELLS


The  method  proposed  in  this  Appendix  is  developed  for  the  case  of 
only  qualitative  factors  in  a  given  data  layout.  However,  an  extension 
to  cases  with  quantitative  factors  is  easily  possible.  The  method  of 
fitting  constants  is  treated  strictly  from  the  viewpoint  of  hypothesis 
testing.  Therefore,  emphasis  is  put  on  the  proofs  that  the  null 
hypotheses  are  testable  when  the  backward  ranking  technique  of  the 
factorial  effects  is  applied. 

In  order  to  introduce  some  of  the  concepts  of  the  proposed  method, 
the  two-way  crossed  classification  example  from  Section  2.1.1  of  the  present 
report  is  used  (Example  A  below)  together  with  two  modifications  (Examples  B 
and  C).  Then,  a  three-way  crossed  classification  example  (Example  D)  where 
all  cells  are  occupied  is  treated.  Finally,  all  the  essential  features  of 
the  method  are  exemplified,  in  a  combined  manner,  with  a  three-way  crossed 
classification  where  some  cells  are  empty  (Example  E). 

Example  A 


In Figure 1 the layout of Example A is given together with the two
marginal one-way classifications for factors $\mathcal{A}$ and $\mathcal{B}$. The following
are some of the concepts and symbols which are used for the various features
of the fitting process: A cell is identified by the sequence of the
factor level symbols (in alphabetical order of the factors) which uniquely
define the cell. For example, the cell identification for the cell defined
by the first level of $\mathcal{A}$ and the second level of $\mathcal{B}$ is given by
$\mathcal{A}_1\mathcal{B}_2$. A distinction is made between "basic cells" and "marginal
cells": basic cells are those of the basic (original) classification, whereas
marginal cells are those of the marginal classifications which result from
summing over all levels of at least one factor. For example, cell
$\mathcal{A}_1\mathcal{B}_2$ in Figure 1 is a basic cell, and cell $\mathcal{A}_1$ is a marginal
cell (resulting from summing over the 3 levels of factor $\mathcal{B}$). A "row" of
cells is defined as the group of cells (basic or marginal) which is formed by
keeping constant the levels of all but one factor in the layout of the basic
or marginal cells. For example, by keeping $\beta=1$ constant in Figure 1, the
three cells $\mathcal{A}_1\mathcal{B}_1$, $\mathcal{A}_2\mathcal{B}_1$, and $\mathcal{A}_3\mathcal{B}_1$ form a row of
basic cells. An "X" in a cell (basic or marginal) means that the cell is
occupied, i.e., that there are observations in this cell. A circle around
the "X" means that a constant (parameter) has been fitted based on the
observation(s) in this cell; and a checkmark (of one of three types to be
defined) through an "X" means that a constant either has not been fitted or
cannot be fitted based on the observations in the checkmarked cell.
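
The notions of basic cell, marginal cell, and row can be made concrete with a small sketch. The encoding below is an assumption made only for illustration (cells as tuples of 1-based level indices, and the hypothetical helper names `marginal_cell` and `row`); it is not part of NOVACCM.

```python
from itertools import product

# Hypothetical encoding: a basic cell is a tuple of 1-based level indices,
# here (alpha, beta) for the 3 x 3 layout of Example A.
LEVELS = (3, 3)
occupied = set(product(range(1, 4), range(1, 4)))  # every basic cell holds data


def marginal_cell(occupied, fixed):
    """Occupied basic cells pooled into the marginal cell whose factor levels
    are given in `fixed` ({factor index: level}). The marginal cell is
    occupied exactly when the returned set is non-empty."""
    return {c for c in occupied if all(c[f] == lev for f, lev in fixed.items())}


def row(levels, fixed):
    """Cells obtained by holding every factor in `fixed` constant and letting
    the single remaining factor run over all of its levels."""
    free = [f for f in range(len(levels)) if f not in fixed]
    if len(free) != 1:
        raise ValueError("a row varies exactly one factor")
    return [
        tuple(fixed.get(f, lev) for f in range(len(levels)))
        for lev in range(1, levels[free[0]] + 1)
    ]


print(row(LEVELS, {1: 1}))                       # row of basic cells with beta = 1
print(sorted(marginal_cell(occupied, {0: 1})))   # basic cells behind marginal cell A1
```

Factor index 0 stands for $\mathcal{A}$ and index 1 for $\mathcal{B}$; the same helpers extend to three-way layouts by lengthening the tuples.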




In  Figure  1  the  two  marginal  one-way  classifications  ("rows  of 
marginal  cells"  by  structure)  will  be  used  to  demonstrate  the  fitting  of 
main  effect  constants,  and  the  two-way  classification  (of  the  basic  cells) 
will  be  used  to  demonstrate  the  fitting  of  the  interaction  constants. 

Figure 1: Layout of Example A.

Legend:
Type I-checkmark (see Rule I)
Type II-checkmark (see Rule IIa)

Since a linear restriction has to be imposed on each set of main
effect constants (i.e., for $\mathcal{A}$ and $\mathcal{B}$), a constant based on the
observations in one of the ($A=3$ and $B=3$) marginal cells is linearly
dependent upon the constants based on the observations in the other (two)
cells. Generally, the two constants do not have to be based on the first two
cells (as done for both factors $\mathcal{A}$ and $\mathcal{B}$ in Figure 1). However, if
the linear restrictions of the Graybill type, $a_A = b_B = 0$, are chosen as
suggested in Section 2.1.1 of the present report, the last cell cannot be
used as a basis for fitting a constant since this constant is eliminated
a priori from the model.




Under the corresponding restrictions for the interaction constants
($ab_{\alpha B} = ab_{A\beta} = 0$ for all $\alpha$ and $\beta$), the same argument applies
to each row and each column of the two-way classification: only the
interaction constants $ab_{11}$, $ab_{12}$, $ab_{21}$, and $ab_{22}$ can be fitted. This
completes the full set of $AB-1=8$ constants contained in the model as
given in equation (2-5) of Section 2.1.1.

The argumentation just used is the basis for the fitting of main
effect and interaction constants by visual inspection, according to which
the fitting will be performed from here on: circles are used (Figure 1)
in the four cells $\mathcal{A}_1$, $\mathcal{A}_2$, $\mathcal{B}_1$, and $\mathcal{B}_2$ to indicate that main
effect constants have been based on the observations in these cells. As a
consequence of this choice, the last cells in both rows of marginal cells
have to be checkmarked. "Checkmarking" is used here as a synonym for equating
to zero the constant which would have been based on the cell if it had been
possible. The first rule for checkmarking a cell is thus stated:

Rule I. If, in a row of basic or marginal cells, all but one
occupied cell have been circled (where the choice of cells to be
circled is up to the analyst), the one occupied cell left will
receive a "Type I-checkmark": see legend in Figure 1.

Note that, as indicated before, it would have been possible to choose,
for example, the marginal cells $\mathcal{A}_1$ and $\mathcal{A}_3$ for circling, which would
have left cell $\mathcal{A}_2$ to receive a Type I-checkmark. (This would have implied
a set of linear restrictions different from the Graybill type.)

In fitting the four interaction constants in the two-way layout of
basic cells as shown in Figure 1, the last cells in the 4 rows defined by
$\alpha=1$, $\alpha=2$, $\beta=1$, and $\beta=2$ also receive Type I-checkmarks according to
Rule I. Then, the only occupied cell not yet considered in the fitting
process by visual inspection is $\mathcal{A}_3\mathcal{B}_3$. The reasoning for not being able,
in the visual process, to base an interaction constant on this cell, given
that the four interaction constants have been fitted as indicated in Figure 1,
is as follows. Each interaction effect (of any order), to be represented by a
fitted constant, must be interpretable as a contrast of contrasts and,
therefore, requires two occupied cells in the rows of (basic or marginal)
cells with which the effect is to be associated. Further, if two occupied
cells in a row are available, a choice must obviously exist to actually base
the constant on the one or the other occupied cell. This feature of having
the choice to base the constant on either one of the cells in the row will
be called "reversibility."

Inspecting, in Figure 1, the row of basic cells defined by $\beta=3$, one
can see that once the cells $\mathcal{A}_1\mathcal{B}_3$ and $\mathcal{A}_2\mathcal{B}_3$ are checkmarked (by
Rule I), there are no cells left (in the row $\beta=3$) for which reversibility
exists. Therefore, no interaction constant can be based on cell
$\mathcal{A}_3\mathcal{B}_3$, and it is checkmarked according to "Rule IIa":


Rule IIa. If, in a row of basic or marginal cells, all but one
cell have been checkmarked according to Rule I such that no
reversibility (as defined above) exists for the one cell left,
the cell will receive a "Type II-checkmark": see legend in
Figure 1.

(Note: A "Rule IIb" and a "Rule IIc", according to which the Type II-checkmark
will again be applied, are given in the discussions of Examples
B and D, respectively, after the definition of a "Rule III.")

Notice that, with respect to cell $\mathcal{A}_3\mathcal{B}_3$, the necessity of applying
the Type II-checkmark is also evidenced by the two Type I-checkmarks in
the row defined by $\alpha=3$.

This completes the fitting process by visual inspection, following
the established Rules I and IIa, for Example A.

Performing, in Example A, the analysis of variance corresponding to
the backward ranking process under restricted admissibility (see Section
2.1.2), the only admissible null hypothesis at the first step is the null
hypothesis concerning the interaction effect $\mathcal{A}\mathcal{B}$, i.e.,
$H_0\{ab_{11}=ab_{12}=ab_{21}=ab_{22}=0\}$. This joint hypothesis is testable since
for each of the four ab-constants there is a linear function of the
observations having the particular ab-constant (parameter) as expectation
under the given model. For example,

$$E[y_{12\rho} - y_{13\rho} - y_{32\rho} + y_{33\rho}] = ab_{12}.$$

At the second step, provided the null hypothesis about $\mathcal{A}\mathcal{B}$ was not
rejected and the ab-constants were deleted from the model, the null
hypotheses about the main effects of both factors $\mathcal{A}$ and $\mathcal{B}$ are
admissible. Both hypotheses $H_0\{a_1=a_2=0\}$ and $H_0\{b_1=b_2=0\}$ are
testable since there are linear functions of the observations which have
$a_1$, $a_2$, $b_1$, or $b_2$, whichever is applicable, as expectation under the
given model.
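
The testability argument can be checked mechanically by writing the expectation of each observation as a list of parameter coefficients. The following is a minimal sketch, assuming the two-way model $\mu + a_\alpha + b_\beta + ab_{\alpha\beta}$ with Graybill-type restrictions (all constants at a last level equal to zero); the function names `expectation` and `contrast` are illustrative and not taken from the report.

```python
from collections import Counter

A, B = 3, 3  # numbers of levels; Graybill-type restrictions drop the last level


def expectation(alpha, beta, interactions):
    """Expected value of y[alpha, beta, rho] as {parameter name: coefficient}.
    Constants at a last level (alpha == A or beta == B) are zero by
    restriction; an interaction constant enters only if its cell is listed
    in `interactions`."""
    e = Counter({"mu": 1})
    if alpha < A:
        e[f"a{alpha}"] += 1
    if beta < B:
        e[f"b{beta}"] += 1
    if (alpha, beta) in interactions:
        e[f"ab{alpha}{beta}"] += 1
    return e


def contrast(terms, interactions):
    """Expected value of sum(sign * y[cell]); zero coefficients are dropped."""
    total = Counter()
    for sign, (alpha, beta) in terms:
        for name, coef in expectation(alpha, beta, interactions).items():
            total[name] += sign * coef
    return {k: v for k, v in total.items() if v != 0}


# Example A, first step: all four ab-constants are in the model.
full = {(1, 1), (1, 2), (2, 1), (2, 2)}
print(contrast([(+1, (1, 2)), (-1, (1, 3)), (-1, (3, 2)), (+1, (3, 3))], full))

# Second step: ab-constants deleted; y11 - y31 then isolates a1.
print(contrast([(+1, (1, 1)), (-1, (3, 1))], set()))
```

A contrast whose result contains a single parameter with coefficient 1 is a linear function of the observations having that parameter as expectation, which is the property the text uses to call a hypothesis testable.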

Considering  only  degrees  of  freedom,  the  AB-1=8  degrees  of  freedom 
"between  cells"  are  assigned  to  the  three  factorial  effects  as  follows,  as 
would  be  expected  for  a  layout  in  which  all  cells  are  occupied: 


$\mathcal{A}$                          2

$\mathcal{B}$                          2

$\mathcal{A}\mathcal{B}$                        4

Total = "between cells"    8


Example B

Example B is a modification of the previous Example A: the (basic)
cells $\mathcal{A}_1\mathcal{B}_2$, $\mathcal{A}_2\mathcal{B}_3$, and $\mathcal{A}_3\mathcal{B}_3$ are now empty, whereas the
other six cells are occupied as before; see Figure 2. By employing the
fitting process by visual inspection, the main effect constants are fitted
as before, which consumes 4 of the now 5 available degrees of freedom
"between cells." Obviously, this time only 1 interaction constant can be
fitted.

Inspection of the two-way layout in Figure 2 shows that cell $\mathcal{A}_1\mathcal{B}_3$
is the only occupied cell in the row of basic cells defined by $\beta=3$. As
stated before, any fitted constant requires the availability of two
occupied cells in the row of cells with which the effect (= contrast),
represented by the fitted constant, is to be associated. Therefore,
certainly no constant can be based on a single occupied cell in a row such
as $\mathcal{A}_1\mathcal{B}_3$ in Example B. This leads to the simple "Rule III":

Rule III. If, in a row of basic or marginal cells, there is
only one cell occupied, this cell will receive a "Type III-checkmark":
see legend in Figure 2.

Figure 2: Layout of Example B.

Legend:
Type II-checkmark (see Rules IIa and IIb)
Type III-checkmark (see Rule III)


Once cell $\mathcal{A}_1\mathcal{B}_3$ in Figure 2 is checkmarked, cell $\mathcal{A}_1\mathcal{B}_1$ remains
the only occupied cell not checkmarked in row $\alpha=1$. Therefore, no
reversibility exists for the two occupied cells in that row, and it is not
possible to base an interaction constant on cell $\mathcal{A}_1\mathcal{B}_1$ either. However,
unlike in Example A, this time the checkmarking of a cell, which is the only
cell not yet checkmarked in a row of cells, is not a consequence of fitting
constants by choice (i.e., of circling cells in other rows), but a
consequence of checkmarking a cell when no alternative exists. This
leads to the definition of "Rule IIb":

Rule IIb. If, in a row of basic or marginal cells, all but one
cell have been checkmarked according to Rule III, the one cell
left will receive a "Type II-checkmark." (See legend in Figure 2.)
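
Rules III and IIb depend only on the pattern of occupied cells, so they can be applied without any choice by the analyst. The sketch below does this for the layout of Example B as it is described in the text (cells $\mathcal{A}_1\mathcal{B}_2$, $\mathcal{A}_2\mathcal{B}_3$, $\mathcal{A}_3\mathcal{B}_3$ empty); the tuple encoding and the function names are assumptions made for illustration.

```python
A, B = 3, 3
# Occupied basic cells of Example B as (alpha, beta);
# cells (1, 2), (2, 3) and (3, 3) are empty.
occupied = {(1, 1), (1, 3), (2, 1), (2, 2), (3, 1), (3, 2)}


def rows():
    """All rows of the two-way layout: fix alpha and vary beta, then the reverse."""
    for a in range(1, A + 1):
        yield [(a, b) for b in range(1, B + 1)]
    for b in range(1, B + 1):
        yield [(a, b) for a in range(1, A + 1)]


def type_iii(occupied):
    """Rule III: a cell that is the only occupied cell in some row."""
    marked = set()
    for r in rows():
        occ = [c for c in r if c in occupied]
        if len(occ) == 1:
            marked.add(occ[0])
    return marked


def type_iib(occupied, iii):
    """Rule IIb: the single unmarked occupied cell of a row whose other
    occupied cells all carry Type III-checkmarks."""
    marked = set()
    for r in rows():
        unmarked = [c for c in r if c in occupied and c not in iii]
        others = [c for c in r if c in occupied and c in iii]
        if len(unmarked) == 1 and others:
            marked.add(unmarked[0])
    return marked


iii = type_iii(occupied)
iib = type_iib(occupied, iii)
print("Type III:", iii)
print("Type II (Rule IIb):", iib)
print("degrees of freedom between cells:", len(occupied) - 1)
```

With four degrees of freedom taken by the main effects, the five degrees of freedom "between cells" leave room for exactly one interaction constant, in agreement with the text.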

The fitting process by visual inspection in Example B is completed
by choosing to base the one interaction constant on cell $\mathcal{A}_2\mathcal{B}_1$, for
example. This is done in Figure 2, and cell $\mathcal{A}_2\mathcal{B}_1$ is circled accordingly.
As a consequence, cells $\mathcal{A}_2\mathcal{B}_2$ and $\mathcal{A}_3\mathcal{B}_1$ are checkmarked following
Rule I and cell $\mathcal{A}_3\mathcal{B}_2$ is checkmarked following Rule IIa.

At the first step of the ranking process, the only admissible null
hypothesis is again that on the interaction, $H_0\{ab_{21}=0\}$. This hypothesis
is testable since $E[y_{21\rho} - y_{22\rho} - y_{31\rho} + y_{32\rho}] = ab_{21}$ under the
given model.

Example  C 

The layout of Example C, as given in Figure 3, results from
Example A by deleting the observations in the four cells $\mathcal{A}_1\mathcal{B}_2$,
$\mathcal{A}_2\mathcal{B}_1$, $\mathcal{A}_2\mathcal{B}_3$, and $\mathcal{A}_3\mathcal{B}_2$, as indicated.

Now five basic cells are occupied and it appears, from applying Rules
III, I, and IIa (see the checkmarks in Figure 3), that five constants
can be fitted as is indicated by the circles. (The two "I"s in the
marginal cells $\mathcal{A}_2$ and $\mathcal{B}_2$ will be explained later.) However, actually
only 4 degrees of freedom "between cells" are available to be assigned
to factorial effects. The reason for the discrepancy is simple: The
deletion of the observation(s) in cell $\mathcal{A}_2\mathcal{B}_2$ would lead not only to the
loss of one degree of freedom "between cells", but also to the loss of
one degree of freedom for each of the main effects of factors $\mathcal{A}$ and
$\mathcal{B}$. In other words, the observation(s) occupying the basic cell
$\mathcal{A}_2\mathcal{B}_2$ cause both marginal cells $\mathcal{A}_2$ and $\mathcal{B}_2$ to be occupied; that
is, deleting all observations from cell $\mathcal{A}_2$ also deletes all observations
from cell $\mathcal{B}_2$. The type of relation among non-empty cells (basic or
marginal) thus exemplified will be expressed in an algebraic identity
containing the symbols (identifications) of the cells involved in the
relation, that is, in the present example

$$\mathcal{A}_2 \equiv \mathcal{B}_2.$$

Each such identity represents one confounded degree of freedom.
That is, one degree of freedom of all those factorial effects, whose
constants are based on the cells represented by their cell symbols in
the identity, is confounded. In Example C, therefore, one degree of
freedom of each of the main effects is confounded since the constants
$a_2$ and $b_2$ are based on the (marginal) cells $\mathcal{A}_2$ and $\mathcal{B}_2$, respectively.
For this reason, cells $\mathcal{A}_2$ and $\mathcal{B}_2$ are marked with an "I" (for Identity)
in Figure 3. Clearly in this case, either $a_2$ or $b_2$ can be fitted, but
not both simultaneously. Therefore, at the first step of ranking in this
example, $H_0\{ab_{11}=0\}$ will be tested with the model containing either the
constants $a_1$, $a_2$, $b_1$, or $a_1$, $b_1$, $b_2$. At this first step it makes no
difference whether $a_2$ or $b_2$ is fitted in addition to $a_1$ and $b_1$.
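
The confounding of $a_2$ and $b_2$ in Example C shows up as a rank deficiency of the design matrix. The sketch below is one way to check this under stated assumptions: one row per occupied cell (replication within cells does not change the rank), Graybill-type restrictions, and a small exact-arithmetic `rank` helper written for this illustration.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix by Gauss-Jordan elimination in exact arithmetic."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(len(m)):
            if i != rk and m[i][col] != 0:
                f = m[i][col] / m[rk][col]
                m[i] = [x - f * y for x, y in zip(m[i], m[rk])]
        rk += 1
        if rk == len(m):
            break
    return rk


# Occupied basic cells of Example C as (alpha, beta).
cells = [(1, 1), (1, 3), (2, 2), (3, 1), (3, 3)]


def design(columns):
    """One row per occupied cell; restrictions a3 = b3 = 0 are built in."""
    def entry(cell, name):
        a, b = cell
        return int({"mu": True, "a1": a == 1, "a2": a == 2,
                    "b1": b == 1, "b2": b == 2,
                    "ab11": (a, b) == (1, 1)}[name])
    return [[entry(c, n) for n in columns] for c in cells]


both = ["mu", "a1", "a2", "b1", "b2", "ab11"]
print(rank(design(both)), "of", len(both), "columns")   # deficient: a2 and b2 coincide
print(rank(design([n for n in both if n != "b2"])))      # a2 kept, b2 dropped
print(rank(design([n for n in both if n != "a2"])))      # b2 kept, a2 dropped
```

The columns for $a_2$ and $b_2$ are identical because cell $\mathcal{A}_2\mathcal{B}_2$ is the only occupied cell at either level, which is the matrix form of the identity $\mathcal{A}_2 \equiv \mathcal{B}_2$.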

Once the interaction constant $ab_{11}$ is deleted from the model
(assuming that $H_0\{ab_{11}=0\}$ was not rejected) the fitting of $a_2$ or $b_2$
depends upon which null hypothesis is to be tested. For example, in order
to test $H_0\{b_1=0\}$, i.e., to test the hypothesis that there is no main effect
due to factor $\mathcal{B}$ in addition to the main effect of factor $\mathcal{A}$, the reduced
model must contain the constants $a_1$ and $a_2$. The corresponding argument
holds for the testing of $H_0\{a_1=0\}$, in which case $b_1$ and $b_2$ must be
contained in the reduced model. Both null hypotheses about $a_1$ and $b_1$ are
testable, since, for example, $E[y_{11\rho} - y_{13\rho}] = b_1$ and
$E[y_{11\rho} - y_{31\rho}] = a_1$ under the models containing $a_1$, $a_2$, $b_1$, and
$a_1$, $b_1$, $b_2$, respectively. Naturally, not rejecting the hypothesis
$H_0\{b_1=0\}$, for example, does not imply that the overall main effect of
$\mathcal{B}$ is not significant, but it does imply, given the pattern of empty
cells, that the differences among the five cell means can sufficiently be
explained by the main effect of factor $\mathcal{A}$ alone. In this case, once the
constant $b_1$ has been deleted from the model, the hypothesis
$H_0\{a_1=a_2=0\}$ is testable. A similar argument holds for the case of not
rejecting the hypothesis $H_0\{a_1=0\}$. Rejecting $H_0\{a_1=0\}$ or $H_0\{b_1=0\}$
means that the corresponding main effect is significant, at least based on
the one unconfounded degree of freedom.

The concept of identities will be further discussed in Example E
below.


Example  D 

Example D is the basis for Example E which will be used to demonstrate
all features of the fitting process in a combined manner. Example D results
from Example A by introduction of a third factor, $\mathcal{C}$, with $C=2$ levels. All
$A \times B \times C = 3 \times 3 \times 2$ cells are assumed occupied, and the layout is
given in Figure 4.

Figure 4: Layout of Example D.


The three marginal one-way classifications and the three marginal two-way
classifications are not shown since the fitting of constants and testing
of null hypotheses for main effects and first-order interactions correspond
to those shown in Example A.

The fitting of second-order interaction constants (abc-terms) by
visual inspection follows the rules established before. For example, in
the row of basic cells defined by $\beta=2$ and $\gamma=1$ only two abc-constants can
be fitted. Circling the cells $\mathcal{A}_1\mathcal{B}_2\mathcal{C}_1$ and $\mathcal{A}_2\mathcal{B}_2\mathcal{C}_1$ (i.e.,
fitting the constants $abc_{121}$ and $abc_{221}$) leads to the checkmarks in cells
$\mathcal{A}_1\mathcal{B}_2\mathcal{C}_2$, $\mathcal{A}_2\mathcal{B}_2\mathcal{C}_2$, and $\mathcal{A}_3\mathcal{B}_2\mathcal{C}_1$ according to
Rule I. The other two constants fitted are $abc_{111}$ and $abc_{211}$, and the
checkmarks in the remaining cells except $\mathcal{A}_3\mathcal{B}_3\mathcal{C}_2$ are applied in an
obvious manner following Rules I and IIa. The checkmark (of Type II) in cell
$\mathcal{A}_3\mathcal{B}_3\mathcal{C}_2$ is applied following a similar reasoning as that used for
Rule IIa: Cell $\mathcal{A}_3\mathcal{B}_3\mathcal{C}_2$ is the only occupied cell left unmarked
in all three rows of basic cells to which it belongs. In these three rows
(defined by $\alpha=3$, $\beta=3$; $\alpha=3$, $\gamma=2$; and $\beta=3$, $\gamma=2$) all other cells
have been checkmarked according to Rule IIa as a consequence of previous
checkmarking according to Rule I, which was done as a consequence of fitting
the four abc-constants as indicated. Therefore, no reversibility exists for
the remaining cells once the 4 cells as indicated are circled, and cell
$\mathcal{A}_3\mathcal{B}_3\mathcal{C}_2$ accordingly is also checkmarked. This argumentation can
also be generalized to higher-way layouts and thus leads to the last rule to
be defined for the fitting process by visual inspection:

Rule IIc. If, in a row of basic or marginal cells, all but one cell
have been checkmarked according to Rule IIa, or, as a consequence
of Rule IIa, according to the present Rule IIc, such that no
reversibility (as defined before) exists for the occupied cells
of the row concerned, the one cell will receive a "Type II-checkmark."

Note. Rules IIa, IIb, and IIc could be combined into one "Rule II" which
would state the following: Any cell will receive a Type II-checkmark
which is left as the only unmarked occupied cell in a row of basic or
marginal cells where all other cells have been checkmarked according to
Rule I or III or, as a consequence of Rule I or III, according to "Rule II."
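
Once the analyst has chosen the cells to circle, Rules I, III, and the combined "Rule II" can be applied mechanically until no further cell can be marked. The following sketch is an illustrative reading of the rules, not the report's procedure: cells are tuples of 1-based level indices, the choice of circled cells is an input, and the three variants IIa, IIb, and IIc are not distinguished.

```python
from itertools import product


def all_rows(levels):
    """Every row of an n-way layout: hold all factors but one constant."""
    n = len(levels)
    for free in range(n):
        others = [range(1, levels[f] + 1) for f in range(n) if f != free]
        for fixed in product(*others):
            row = []
            for lev in range(1, levels[free] + 1):
                cell = list(fixed)
                cell.insert(free, lev)
                row.append(tuple(cell))
            yield row


def checkmark(levels, occupied, circled):
    """Return {cell: 'I' | 'II' | 'III'} for the occupied, uncircled cells
    that the rules force to be checkmarked."""
    marks = {}
    rows = [[c for c in r if c in occupied] for r in all_rows(levels)]
    for occ in rows:                       # Rule III: single occupied cell in a row
        if len(occ) == 1 and occ[0] not in circled:
            marks[occ[0]] = "III"
    for occ in rows:                       # Rule I: all but one occupied cell circled
        rest = [c for c in occ if c not in circled]
        if len(occ) > 1 and len(rest) == 1 and rest[0] not in marks:
            marks[rest[0]] = "I"
    changed = True
    while changed:                         # combined Rule II, repeated to a fixed point
        changed = False
        for occ in rows:
            open_cells = [c for c in occ if c not in circled and c not in marks]
            no_circle = not any(c in circled for c in occ)
            if len(occ) > 1 and len(open_cells) == 1 and no_circle:
                marks[open_cells[0]] = "II"
                changed = True
    return marks


# Example D: complete 3 x 3 x 2 layout, circles for abc111, abc121, abc211, abc221.
occupied_d = set(product(range(1, 4), range(1, 4), range(1, 3)))
circled_d = {(1, 1, 1), (1, 2, 1), (2, 1, 1), (2, 2, 1)}
marks_d = checkmark((3, 3, 2), occupied_d, circled_d)
for cell in sorted(marks_d):
    print(cell, marks_d[cell])
```

For Example D every one of the 18 cells ends up either circled or checkmarked, and cell $\mathcal{A}_3\mathcal{B}_3\mathcal{C}_2$ is reached only after the other Type II-checkmarks are in place, which mirrors the role of Rule IIc.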

After finishing the fitting process for Example D, the choice of
the four abc-constants fitted can be seen to correspond to the choice of
the linear restrictions of the Graybill type. The restrictions read, for
the three-factor interaction constants: $abc_{\alpha\beta C} = 0$ for all $(\alpha,\beta)$;
$abc_{\alpha B\gamma} = 0$ for all $(\alpha,\gamma)$; and $abc_{A\beta\gamma} = 0$ for all $(\beta,\gamma)$.

At the first step of the ranking process, the only null hypothesis
admissible is that concerning the three-factor interaction $\mathcal{A}\mathcal{B}\mathcal{C}$, and
obviously, this hypothesis is testable.

Example  E 

This example results from Example D by deletion of the observations
in cells as indicated in Figure 5a. The example contains all the essential
features of the method as they were successively introduced in Examples A
through D. Example E, therefore, will be used to demonstrate all the
essential aspects of the proposed method.

In addition to the layout of the three-way classification given in
Figure 5a (in which factor $\mathcal{C}$ is treated as a "subclassification"), Figures
5b and 5c show the same classification arranged such that factors $\mathcal{B}$ and
$\mathcal{A}$, respectively, are treated as the "subclassification." These three
possible arrangements are convenient for the demonstration as will be seen.
Figures 5d, 5e, and 5f show the three marginal two-way classifications for
fitting $\mathcal{A}\mathcal{B}$-, $\mathcal{A}\mathcal{C}$-, and $\mathcal{B}\mathcal{C}$-interaction constants, respectively.

Figure 5a: Layout of Example E ($\mathcal{C}$ as "subclassification")

Figure 5b: Example E ($\mathcal{B}$ as "subclassification")

Figure 5d: Example E, Interaction $\mathcal{A}\mathcal{B}$

A convenient first approach is to fit, by the process of visual
inspection, all constants which appear as if they can be fitted, according
to the rules previously established, in all marginal classifications and in
the basic (three-way) classification. If, in this way, k more constants
result than there are degrees of freedom "between cells", then exactly k
identities must exist for the given data. Naturally, if the number of
constants thus fitted is equal to the number of degrees of freedom "between
cells", identities do not exist for the given data. In the present example,
the fitting of main effect constants (the three marginal one-way
classifications not shown for this example) and first-order interaction
constants (see Figures 5d-5f) yields 2, 2, and 1 constants for $\mathcal{A}$,
$\mathcal{B}$, and $\mathcal{C}$, respectively; and 3, 2, and 2 constants for
$\mathcal{A}\mathcal{B}$, $\mathcal{A}\mathcal{C}$, and $\mathcal{B}\mathcal{C}$, respectively. In order to
evaluate the possibilities of fitting abc-constants, consider Figure 5a.
Four cells, as indicated, are checkmarked according to Rule III. For
example, cell $\mathcal{A}_2\mathcal{B}_2\mathcal{C}_1$ is the only occupied cell in the row of
basic cells defined by $\beta=2$ and $\gamma=1$. Cell $\mathcal{A}_2\mathcal{B}_2\mathcal{C}_2$ is
checkmarked following Rule IIb (as a consequence of the Type III-checkmark
in cell $\mathcal{A}_2\mathcal{B}_2\mathcal{C}_1$). Without fitting an abc-constant first, the
remaining 8 occupied cells cannot be checkmarked. If cell
$\mathcal{A}_1\mathcal{B}_1\mathcal{C}_1$ is chosen as the basis for a constant fitted (i.e., for
$abc_{111}$), the remaining 7 cells can be checkmarked following Rules I, IIa,
and IIc. Summarizing the results from the fitting process, the following
degrees of freedom are preliminarily assigned to the 7 factorial effects:


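$\mathcal{A}$                 2

$\mathcal{B}$                 2

$\mathcal{C}$                 1

$\mathcal{A}\mathcal{B}$               3

$\mathcal{A}\mathcal{C}$               2

$\mathcal{B}\mathcal{C}$               2

$\mathcal{A}\mathcal{B}\mathcal{C}$             1

Total            13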

However, only 13 basic cells are occupied (Figure 5a); consequently, there
are only 12 degrees of freedom "between cells." Accordingly, 13 - 12 = 1
identity must be present in the given layout.
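
The bookkeeping for Example E can be reproduced from the occupancy pattern alone. In the sketch below the set of empty cells is the one implied by the observations that appear in the expectation formulas of this example, which is an inference from the text rather than a reading of Figure 5a; the helper name `sole_occupied` is hypothetical.

```python
from itertools import product

LEVELS = (3, 3, 2)  # A, B, C
# Empty basic cells of Example E as (alpha, beta, gamma), as inferred from the text.
empty = {(1, 2, 1), (1, 2, 2), (2, 1, 2), (2, 3, 1), (3, 2, 1)}
occupied = set(product(range(1, 4), range(1, 4), range(1, 3))) - empty


def sole_occupied(occupied, levels):
    """Cells that are the only occupied cell in at least one row (Rule III)."""
    found = set()
    for free in range(len(levels)):
        groups = {}
        for cell in occupied:
            key = cell[:free] + cell[free + 1:]   # levels of the fixed factors
            groups.setdefault(key, []).append(cell)
        found.update(cells[0] for cells in groups.values() if len(cells) == 1)
    return found


type_iii = sole_occupied(occupied, LEVELS)
df_between = len(occupied) - 1
# Degrees of freedom preliminarily assigned by visual inspection (from the table).
preliminary = {"A": 2, "B": 2, "C": 1, "AB": 3, "AC": 2, "BC": 2, "ABC": 1}
n_identities = sum(preliminary.values()) - df_between

print("Type III cells:", sorted(type_iii))
print("df between cells:", df_between)
print("identities required:", n_identities)
```

The four cells found by `sole_occupied` are the four Type III-checkmarks mentioned for Figure 5a, and the surplus of one constant over the available degrees of freedom signals one identity.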

The search for the identities, if these are not obvious as in
Example C, can be done either systematically or by trial and error. For
example, the analyst can systematically delete the observations of each
basic and marginal cell, cell by cell, and examine the numbers of constants
which can be fitted in each situation. In general, this examination should
give sufficient hints as to the "location" of the identities. On the other
hand, some experience with the peculiarities of identities will enable the
analyst to find the identities of a given layout much faster by trial and
error. In the present example, for instance, it is not difficult to find
that deleting the observations from the marginal cells $\mathcal{A}_2\mathcal{B}_1$ and
$\mathcal{B}_2\mathcal{C}_1$ causes the marginal cell $\mathcal{A}_2\mathcal{C}_1$ to be empty too.
Algebraically, this relationship is expressed by the identity

$$\mathcal{A}_2\mathcal{B}_1 + \mathcal{B}_2\mathcal{C}_1 \equiv \mathcal{A}_2\mathcal{C}_1.$$


This identity implies the following (see Figures 5d-5f in which the 3
affected cells are marked with an "I"): One degree of freedom is confounded
in each of the three two-factor interactions, i.e., in $\mathcal{A}\mathcal{B}$,
$\mathcal{A}\mathcal{C}$, and $\mathcal{B}\mathcal{C}$. The three constants affected are those based on
the observations in the three marginal cells whose symbols are contained in
the identity, i.e., $ab_{21}$, $ac_{21}$, and $bc_{21}$. Any pair of these three
constants can be fitted; the simultaneous fitting of all three constants is
not possible since it would lead to a singular matrix of the normal
equations.
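
The statement that any two, but not all three, of the confounded constants can be fitted can be checked by comparing the rank of the design matrix with its number of columns. The sketch below assumes the occupancy pattern inferred from the text for Example E, Graybill-type restrictions, and one design row per occupied cell; `rank`, `PARAMS`, and `full_rank_without` are names introduced only for this illustration.

```python
from fractions import Fraction
from itertools import product


def rank(rows):
    """Rank of a matrix by Gauss-Jordan elimination in exact arithmetic."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(len(m)):
            if i != rk and m[i][col] != 0:
                f = m[i][col] / m[rk][col]
                m[i] = [x - f * y for x, y in zip(m[i], m[rk])]
        rk += 1
        if rk == len(m):
            break
    return rk


empty = {(1, 2, 1), (1, 2, 2), (2, 1, 2), (2, 3, 1), (3, 2, 1)}
cells = sorted(set(product(range(1, 4), range(1, 4), range(1, 3))) - empty)

# Constant -> levels (alpha, beta, gamma) it belongs to; None matches any level.
PARAMS = {
    "mu": (None, None, None),
    "a1": (1, None, None), "a2": (2, None, None),
    "b1": (None, 1, None), "b2": (None, 2, None),
    "c1": (None, None, 1),
    "ab11": (1, 1, None), "ab21": (2, 1, None), "ab22": (2, 2, None),
    "ac11": (1, None, 1), "ac21": (2, None, 1),
    "bc11": (None, 1, 1), "bc21": (None, 2, 1),
    "abc111": (1, 1, 1),
}


def design(names):
    """Design matrix with one row per occupied cell and one column per constant."""
    return [[int(all(p is None or p == lev for p, lev in zip(PARAMS[n], cell)))
             for n in names] for cell in cells]


def full_rank_without(dropped):
    """True if the model without the constant `dropped` has a non-singular
    matrix of normal equations (full column rank of the design matrix)."""
    names = [n for n in PARAMS if n != dropped]
    return rank(design(names)) == len(names)


print(rank(design(list(PARAMS))), "of", len(PARAMS), "columns")
for name in ("ab21", "ac21", "bc21", "ab11"):
    print(name, full_rank_without(name))
```

Dropping a constant that is not part of the identity, such as $ab_{11}$, does not remove the singularity, which is why the set of confounded constants has to be located before the null hypotheses are tested.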

The above identity happens to contain only cell symbols for which no
factor is at its last level, i.e., at $\alpha=A=3$, $\beta=B=3$, or $\gamma=C=2$. This
absence of last levels is very desirable since cell symbols at the last level
of any factor cannot be associated with constants to be fitted because these
constants are deleted a priori from the model if one uses the suggested
linear restrictions of the Graybill type. Whenever applicable and possible,
therefore, the levels of the factors should be interchanged such that the
identities contain only cell symbols in which none of the factors is at its
last level. The interchanging is feasible when the method being discussed
is applied to cases with only qualitative factors or (quantitative) factors
which are treated as qualitative factors. Should it be impossible to free
the identities from cell symbols at last factor levels, it will still be
possible to find, for each identity, a set of constants which cannot be
fitted simultaneously. (Note that, for the proper testing of null hypotheses,
a set of constants which cannot be fitted simultaneously must be found.)
The search for this set of constants again may have to be done by trial and
error, i.e., by observing whether or not the matrix of the normal equations
is non-singular while trying various possibilities of fitting.

There may be more than one identity for a given set of data. However,
all identities must be linearly independent from each other in order to
account for one confounded degree of freedom each. (The latter implies that,
in a system of identities, the cell symbols can be added to and subtracted
from each other, which is stated without proof.) For instance, two or more
identities are linearly dependent when they can be added to yield a "trivial"
identity. A trivial identity is one which does not account for a confounded
degree of freedom. In the present Example E one such trivial identity is
(see Figure 5a):

$$\mathcal{A}_2\mathcal{B}_2 + \mathcal{A}_3\mathcal{B}_2 \equiv \mathcal{B}_2\mathcal{C}_1 + \mathcal{B}_2\mathcal{C}_2$$

As can be seen, with cell $\mathcal{A}_1\mathcal{B}_2$ being empty, both sides of the identity
are equal to $\mathcal{B}_2$.
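
One way to read an identity is as an equality between the sets of occupied basic cells that are pooled into the marginal cells on each side. The sketch below checks the identity of Example E and the trivial identity under that reading, which is an interpretation adopted for illustration; the occupancy pattern is again the one inferred from the text, and `pooled` is a hypothetical helper.

```python
from itertools import product

empty = {(1, 2, 1), (1, 2, 2), (2, 1, 2), (2, 3, 1), (3, 2, 1)}
occupied = set(product(range(1, 4), range(1, 4), range(1, 3))) - empty


def pooled(**fixed):
    """Occupied basic cells pooled into a marginal cell, e.g. pooled(a=2, b=1)
    for the marginal cell at level 2 of factor A and level 1 of factor B."""
    index = {"a": 0, "b": 1, "c": 2}
    return {cell for cell in occupied
            if all(cell[index[f]] == lev for f, lev in fixed.items())}


# Identity of the text: A2B1 + B2C1 = A2C1
lhs = pooled(a=2, b=1) | pooled(b=2, c=1)
rhs = pooled(a=2, c=1)
print(sorted(lhs), sorted(rhs))

# Trivial identity: A2B2 + A3B2 = B2C1 + B2C2, both sides being B2
t_lhs = pooled(a=2, b=2) | pooled(a=3, b=2)
t_rhs = pooled(b=2, c=1) | pooled(b=2, c=2)
print(sorted(t_lhs), sorted(t_rhs), sorted(pooled(b=2)))
```

The set comparison alone does not tell a trivial identity from one that carries a confounded degree of freedom; that distinction still requires the rank argument on the normal equations.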

After finding the identities (if present) and finding, for each
identity, one set of confounded constants which cannot be fitted
simultaneously, the appropriate null hypotheses can be tested.

In  the  present  example,  at  the  first  step  of  the  ranking  process, 
only  Ho{abci11=Q}  is  admissible  for  testing  (assuming  there  is  a  possibility 
for  testing,  i.e.,  there  exists  a  valid  estimate  of  the  experimental  error). 
This  hypothesis  is  testable  Blnee  there  is  a  linear  function  of  the  obser¬ 
vations  which  has  abclu  as  expectation,  under  the  model  containing  the 
constants  abem,  abu,  ab2a,  acn,  bcu,  ai,  as,  b3,  b2,  cj,  plus  any  two 
of  the  three  confounded  interaction  constants,  ab£1,  ac2i,  bc21: 

$$E[(y_{111p} - y_{131p} - y_{112p} + y_{132p}) - (y_{311p} - y_{331p} - y_{312p} + y_{332p})] = abc_{111}.$$
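
Why this function has $abc_{111}$ as expectation can be checked cell by cell. The derivation below is a sketch under an assumed reference-level coding, in which a constant enters a cell's expected value only when all of its subscripts match the cell's levels (level 3 of A and B and level 2 of C acting as baseline). This coding and the symbol $\mu$ for the general mean are assumptions, not stated in this section.

$$
\begin{aligned}
E[y_{111p}] &= \mu + a_1 + b_1 + c_1 + ab_{11} + ac_{11} + bc_{11} + abc_{111}, & E[y_{131p}] &= \mu + a_1 + c_1 + ac_{11},\\
E[y_{112p}] &= \mu + a_1 + b_1 + ab_{11}, & E[y_{132p}] &= \mu + a_1,\\
E[y_{311p}] &= \mu + b_1 + c_1 + bc_{11}, & E[y_{331p}] &= \mu + c_1,\\
E[y_{312p}] &= \mu + b_1, & E[y_{332p}] &= \mu.
\end{aligned}
$$

Hence the first bracket has expectation $bc_{11} + abc_{111}$, the second has expectation $bc_{11}$, and their difference has expectation $abc_{111}$. No cell with level 2 of A is involved, so under this coding none of $ab_{21}$, $ac_{21}$, $bc_{21}$ enters the expectation, whichever two of them are kept in the model.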

Assuming $abc_{111}=0$ to be true (and, consequently, assuming $abc_{111}$ to
have been deleted from the model), at the second step of the ranking
process for this example, the null hypotheses about the interaction effects
AB, AC, and BC are admissible for testing. In order to see how the confounding
will affect the possibilities of testing, a list of the expected values of
the functions indicated below is advantageous. For the present investigation
only, the model is assumed to contain all seven interaction constants
(besides the five main effect constants), that is, $ab_{11}$, $ab_{21}$, $ab_{22}$, $ac_{11}$,
$ac_{21}$, $bc_{11}$, and $bc_{21}$.

The construction of the functions $D_i$ (of individual observations,
$y_{\alpha\beta\gamma p}$), having the desired expected values, was facilitated by inspection
of Figures 5a, 5b, and 5c:

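$$
\begin{aligned}
E[D_1] &= E[y_{111p} - y_{311p} - y_{131p} + y_{331p}] = ab_{11} \\
E[D_2] &= E[y_{222p} - y_{322p} - y_{232p} + y_{332p}] = ab_{22} \\
E[D_3] &= E[y_{111p} - y_{112p} - y_{311p} + y_{312p}] = ac_{11} \\
E[D_4] &= E[y_{111p} - y_{131p} - y_{112p} + y_{132p}] = bc_{11} \\
E[D_5] &= E[y_{221p} - y_{222p} - y_{331p} + y_{332p}] = ac_{21} + bc_{21} \\
E[D_6] &= E[y_{111p} - y_{211p} - y_{132p} + y_{232p}] = ab_{11} + ac_{11} - ab_{21} - ac_{21} \\
E[D_7] &= E[y_{221p} - y_{211p} - y_{322p} + y_{312p}] = ab_{22} - bc_{11} - ab_{21} + bc_{21}
\end{aligned}
$$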
The last two functions will be replaced by linear combinations with other
functions such that the expected values of the new functions contain only
confounded effects:

$$
\begin{aligned}
E[D_6'] &= E[D_1 + D_3 - D_6] = ab_{21} + ac_{21} \\
E[D_7'] &= E[D_2 - D_4 - D_7] = ab_{21} - bc_{21}
\end{aligned}
$$
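
The expected values listed above can be checked mechanically. The Python sketch below assumes a reference-level coding in which a constant enters a cell's expected value only when all of its subscripts match the cell's levels (level 3 of A and B and level 2 of C as baseline). That coding is an assumption inferred from the expectations themselves. The cell subscripts are as read from the text, and all function names are hypothetical.

```python
from collections import Counter

CONSTANTS = ["mu", "a1", "a2", "b1", "b2", "c1",
             "ab11", "ab21", "ab22", "ac11", "ac21", "bc11", "bc21"]


def cell_constants(cell):
    """Constants entering E[y] of cell (i, j, k) under the ASSUMED coding."""
    level = dict(zip("abc", cell))
    present = ["mu"]
    for name in CONSTANTS[1:]:
        letters = name.rstrip("0123456789")   # e.g. "ab" from "ab21"
        digits = name[len(letters):]          # e.g. "21"
        if all(level[f] == int(d) for f, d in zip(letters, digits)):
            present.append(name)
    return present


def expectation(contrast):
    """Expected value of {cell: coefficient} as {constant: coefficient}."""
    total = Counter()
    for cell, coef in contrast.items():
        for name in cell_constants(cell):
            total[name] += coef
    return {name: c for name, c in total.items() if c != 0}


def combine(*terms):
    """Linear combination of contrasts; each term is (weight, contrast)."""
    out = Counter()
    for weight, contrast in terms:
        for cell, coef in contrast.items():
            out[cell] += weight * coef
    return dict(out)


D1 = {(1, 1, 1): 1, (3, 1, 1): -1, (1, 3, 1): -1, (3, 3, 1): 1}
D2 = {(2, 2, 2): 1, (3, 2, 2): -1, (2, 3, 2): -1, (3, 3, 2): 1}
D3 = {(1, 1, 1): 1, (1, 1, 2): -1, (3, 1, 1): -1, (3, 1, 2): 1}
D4 = {(1, 1, 1): 1, (1, 3, 1): -1, (1, 1, 2): -1, (1, 3, 2): 1}
D5 = {(2, 2, 1): 1, (2, 2, 2): -1, (3, 3, 1): -1, (3, 3, 2): 1}
D6 = {(1, 1, 1): 1, (2, 1, 1): -1, (1, 3, 2): -1, (2, 3, 2): 1}
D7 = {(2, 2, 1): 1, (2, 1, 1): -1, (3, 2, 2): -1, (3, 1, 2): 1}
D6p = combine((1, D1), (1, D3), (-1, D6))    # D6' = D1 + D3 - D6
D7p = combine((1, D2), (-1, D4), (-1, D7))   # D7' = D2 - D4 - D7

if __name__ == "__main__":
    for label, d in [("D1", D1), ("D2", D2), ("D3", D3), ("D4", D4),
                     ("D5", D5), ("D6", D6), ("D7", D7),
                     ("D6'", D6p), ("D7'", D7p)]:
        print(label, expectation(d))
```

Under the assumed coding, the mean and all main effect constants cancel in every function, and the two primed functions retain only confounded constants.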


When equating to zero one of the three confounded constants, i.e., $ab_{21}$,
$ac_{21}$, or $bc_{21}$, one can see that the null hypotheses on all three interaction
effects, AB, AC, and BC, are testable. For example, if $ab_{21}$ is set equal
to zero (i.e., if $ab_{21}$ is deleted from the model), the null hypotheses
$H_0\{ab_{11}=ab_{22}=0\}$, $H_0\{ac_{11}=ac_{21}=0\}$, and $H_0\{bc_{11}=bc_{21}=0\}$ are testable since
the functions $D_1$, $D_2$, $D_3$, $D_6'$, $D_4$, and $(-D_7')$ have the respective constants
as expectations. However, if according to the test result, $H_0\{ab_{11}=ab_{22}=0\}$
does not have to be rejected and is assumed to be valid, one does not
have evidence that the interaction AB is not significant. The only valid
conclusion is, given the pattern of empty cells, that all two-factor
interaction effects (if present) can sufficiently be explained by the interactions
AC and BC. Corresponding arguments apply when $ac_{21}$ or $bc_{21}$ are deleted from
the model and $H_0\{ac_{11}=0\}$ or $H_0\{bc_{11}=0\}$, respectively, are assumed to be
valid on grounds of the test results.
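
The effect of deleting one confounded constant can be illustrated with a few lines of Python. The sketch takes the expected values of the three functions involving confounded constants as read from the text (ac21 + bc21, ab21 + ac21, and ab21 - bc21) and reports which functions then have a single remaining constant as expectation. The function and variable names are hypothetical.

```python
# Expected values of the functions involving confounded constants,
# as read from the text: {constant: coefficient}.
EXPECTED = {
    "D5": {"ac21": 1, "bc21": 1},
    "D6'": {"ab21": 1, "ac21": 1},
    "D7'": {"ab21": 1, "bc21": -1},
}


def isolating_functions(deleted):
    """Map each remaining confounded constant to (function, sign) for the
    function whose expectation is that constant alone once `deleted` is
    set equal to zero."""
    result = {}
    for name, expectation in EXPECTED.items():
        remaining = {c: v for c, v in expectation.items() if c != deleted}
        if len(remaining) == 1:
            (constant, sign), = remaining.items()
            result[constant] = (name, sign)
    return result


if __name__ == "__main__":
    for deleted in ("ab21", "ac21", "bc21"):
        print("delete", deleted, "->", isolating_functions(deleted))
```

With ab21 deleted, D6' has ac21 and (-D7') has bc21 as expectation, in agreement with the functions named above. With ac21 or bc21 deleted, two of the three functions again isolate the two remaining constants.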

If each of the null hypotheses on the three interaction effects AB,
AC, and BC has to be rejected, regardless of which one of the three
confounded constants is deleted from the model, the conclusion is clearly
that all two-factor interactions are significant and that the ranking process
has reached the significant model. Naturally, there are many more possible
results, not all of which can be discussed here, when testing the interaction
effects under the condition of the confounding as contained in Example E.
For example, the deleted constant may be $ac_{21}$, and $H_0\{bc_{11}=bc_{21}=0\}$ may be the
only one of the three null hypotheses on interaction effects which does not
have to be rejected. This also would mean that the significant model is
reached in the ranking process.

Once the main effects in Example E become admissible for testing
(i.e., once all interaction constants have been deleted from the model),
their testing is straightforward since they are not affected by identities.

(Note. For a numerical illustration of Example E see Example 5 in
Section 3.4.5.)